Edward R. Tufte 


The Visual Display 
of Quantitative Information 


SECOND EDITION 


Graphics Press · Cheshire, Connecticut


Copyright © 2001 by Edward Rolf Tufte 
PUBLISHED BY GRAPHICS PRESS LLC
POST OFFICE BOX 430, CHESHIRE, CONNECTICUT 06410


All rights to illustrations and text reserved by Edward Rolf Tufte. This work may not be copied, reproduced, or translated in whole or in
part without written permission of the publisher, except for brief excerpts in connection with reviews or scholarly analysis. Use with any form
of information storage and retrieval, electronic adaptation or whatever, computer software, or by similar or dissimilar methods now known
or developed in the future is also strictly forbidden without written permission of the publisher. A number of illustrations are reproduced by
permission; those copyright-holders are credited on page 197.


Printed in the United States of America Second edition, fifth printing, August 2007 


Contents


PART I GRAPHICAL PRACTICE

1 Graphical Excellence 13
2 Graphical Integrity 53
3 Sources of Graphical Integrity and Sophistication 79


PART II THEORY OF DATA GRAPHICS

4 Data-Ink and Graphical Redesign 91
5 Chartjunk: Vibrations, Grids, and Ducks 107
6 Data-Ink Maximization and Graphical Design 123
7 Multifunctioning Graphical Elements 139
8 Data Density and Small Multiples 161
9 Aesthetics and Technique in Data Graphical Design 177

Epilogue: Designs for the Display of Information 191


For my parents 
Edward E. Tufte and Virginia James Tufte 


To the memory of 
John W. Tukey (1915-2000) 


Introduction to the Second Edition 


This new edition provides high-resolution color reproductions of 
the many graphics of William Playfair, adds color to other images 
where appropriate, and includes all the changes and corrections 
accumulated during the 17 printings of the first edition. 


This book began in 1975 when Dean Donald Stokes of Princeton’s 
Woodrow Wilson School asked me to teach statistics to a dozen 
journalists who were visiting that year to learn some economics. 

I annotated a collection of readings, with a long section on
statistical graphics. The literature here was thin, too often grimly 
devoted to explaining use of the ruling pen and to promulgating 
"graphic standards" indifferent to the nature of visual evidence and 
quantitative reasoning. Soon I wrote up some ideas. Then John 
Tukey, the phenomenal Princeton statistician, suggested that we 
give a series of joint seminars. Since the mid-1960s, Tukey had
opened up the field, as his brilliant technical contributions made
it clear that the study of statistical graphics was intellectually
respectable and not just about pie charts and ruling pens.

After moving to Yale University, I finished the manuscript in 
1982. A publisher was interested but planned to print only 2,000 
copies and to charge a very high price, contrary to my hopes for 
a wide readership. I also sought to design the book so as to make 
it self-exemplifying—that is, the physical object itself would reflect 
the intellectual principles advanced in the book. Publishers seemed 
appalled at the prospect that an author might govern design. 

Consequently I investigated self-publishing. This required a first- 
rate book designer, a lot of money (at least for a young professor), 
and a large garage. I found Howard Gralla who had designed 
many museum catalogs with great care and craft. He was willing to 
work closely with this difficult author who was filled with all sorts 
of opinions about design and typography. We spent the summer in
his studio laying out the book, page by page. We were able to
integrate graphics right into the text, sometimes into the middle
of a sentence, eliminating the usual separation of text and image—
one of the ideas Visual Display advocated. To finance the book
I took out another mortgage on my home. The bank officer said
this was the second most unusual loan that she had ever made; first
place belonged to a loan to a circus to buy an elephant!

My view on self-publishing was to go all out, to make the best 
and most elegant and wonderful book possible, without compromise. 
Otherwise, why do it? 

Most of all, the book, as a thing in itself, gave to me fresh new 
eyes for the intellectual and aesthetic joy of visual evidence, visual 
reasoning, and visual understanding. 


January 2001 
Cheshire, Connecticut 


Introduction 


Data graphics visually display measured quantities by means of 
the combined use of points, lines, a coordinate system, numbers, 
symbols, words, shading, and color. 

The use of abstract, non-representational pictures to show numbers 
is a surprisingly recent invention, perhaps because of the diversity 
of skills required — the visual-artistic, empirical-statistical, and 
mathematical. It was not until 1750-1800 that statistical graphics— 
length and area to show quantity, time-series, scatterplots, and 
multivariate displays— were invented, long after such triumphs of 
mathematical ingenuity as logarithms, Cartesian coordinates, the 
calculus, and the basics of probability theory. The remarkable 
William Playfair (1759-1823) developed or improved upon nearly 
all the fundamental graphical designs, seeking to replace conven- 
tional tables of numbers with the systematic visual representations 
of his "linear arithmetic." 

Modern data graphics can do much more than simply substitute 
for small statistical tables. At their best, graphics are instruments 
for reasoning about quantitative information. Often the most effec- 
tive way to describe, explore, and summarize a set of numbers— 
even a very large set—is to look at pictures of those numbers. 
Furthermore, of all methods for analyzing and communicating 
statistical information, well-designed data graphics are usually the 
simplest and at the same time the most powerful. 


The first part of this book reviews the graphical practice of the 
two centuries since Playfair. The reader will, I hope, rejoice in the 
graphical glories shown in Chapter 1 and then condemn the lapses 
and lost opportunities exhibited in Chapter 2. Chapter 3, on graph- 
ical integrity and sophistication, seeks to account for these differ- 
ences in quality of graphical design. 


The second part of the book provides a language for discussing 
graphics and a practical theory of data graphics. Applying to most 
visual displays of quantitative information, the theory leads to 
changes and improvements in design, suggests why some graphics 
might be better than others, and generates new types of graphics. 
The emphasis is on maximizing principles, empirical measures of 
graphical performance, and the sequential improvement of graphics 
through revision and editing. Insights into graphical design are to 
be gained, I believe, from theories of what makes for excellence 
in art, architecture, and prose. 


This is a book about the design of statistical graphics and, as such, 
it is concerned both with design and with statistics. But it is also 
about how to communicate information through the simultaneous 
presentation of words, numbers, and pictures. The design of statis- 
tical graphics is a universal matter—like mathematics—and is not 
tied to the unique features of a particular language. The descriptive 
concepts (a vocabulary for graphics) and the principles advanced 
apply to most designs. I have at times provided evidence about the 
scope of these ideas, by showing how frequently a principle applies 
to (a random sample of) news and scientific graphics. 

Each year, the world over, somewhere between 900 billion
(9 × 10¹¹) and 2 trillion (2 × 10¹²) images of statistical graphics are
printed. The principles of this book apply to most of those graphics. 
Some of the suggested changes are small, but others are substantial, 
with consequences for hundreds of billions of printed pages. 

But I hope also that the book has consequences for the viewers and 
makers of those images—that they will never view or create statis- 
tical graphics the same way again. That is in part because we are 
about to see, collected here, so many wonderful drawings, those 
of Playfair, of Minard, of Marey, and, nowadays, of the computer. 

Most of all, then, this book is a celebration of data graphics. 


PART I 


Graphical Practice 


1 Graphical Excellence


Excellence in statistical graphics consists of complex ideas
communicated with clarity, precision, and efficiency. Graphical
displays should

• show the data

• induce the viewer to think about the substance rather than about
methodology, graphic design, the technology of graphic production,
or something else

• avoid distorting what the data have to say

• present many numbers in a small space

• make large data sets coherent

• encourage the eye to compare different pieces of data

• reveal the data at several levels of detail, from a broad overview
to the fine structure

• serve a reasonably clear purpose: description, exploration,
tabulation, or decoration

• be closely integrated with the statistical and verbal descriptions
of a data set.


Graphics reveal data. Indeed graphics can be more precise and
revealing than conventional statistical computations. Consider
Anscombe's quartet: all four of these data sets are described by
exactly the same linear model (at least until the residuals are
examined).

       I              II             III             IV
   X      Y       X      Y       X      Y       X      Y
 10.0   8.04    10.0   9.14    10.0   7.46     8.0   6.58
  8.0   6.95     8.0   8.14     8.0   6.77     8.0   5.76
 13.0   7.58    13.0   8.74    13.0  12.74     8.0   7.71
  9.0   8.81     9.0   8.77     9.0   7.11     8.0   8.84
 11.0   8.33    11.0   9.26    11.0   7.81     8.0   8.47
 14.0   9.96    14.0   8.10    14.0   8.84     8.0   7.04
  6.0   7.24     6.0   6.13     6.0   6.08     8.0   5.25
  4.0   4.26     4.0   3.10     4.0   5.39    19.0  12.50
 12.0  10.84    12.0   9.13    12.0   8.15     8.0   5.56
  7.0   4.82     7.0   7.26     7.0   6.42     8.0   7.91
  5.0   5.68     5.0   4.74     5.0   5.73     8.0   6.89

N = 11
mean of X's = 9.0
mean of Y's = 7.5
equation of regression line: Y = 3 + 0.5X
standard error of estimate of slope = 0.118
t = 4.24
sum of squares X − X̄ = 110.0
regression sum of squares = 27.50
residual sum of squares of Y = 13.75
correlation coefficient = .82
r² = .67


14 GRAPHICAL PRACTICE 


And yet how they differ, as the graphical display of the data
makes vividly clear:


F. J. Anscombe, "Graphs in Statistical 
Analysis," American Statistician, 27 
(February 1973), 17-21. 


[Figure: scatterplots of Anscombe's four data sets, panels I, II, III, and IV]

And likewise a graphic easily reveals point A, a wildshot obser- 
vation that will dominate standard statistical calculations. Note that 
point A hides in the marginal distribution but shows up as clearly 
exceptional in the bivariate scatter. 


Stephen S. Brier and Stephen E. Fienberg,
"Recent Econometric Modelling
of Crime and Punishment: Support for 
the Deterrence Hypothesis?" in Stephen 
E. Fienberg and Albert J. Reiss, Jr., eds., 
Indicators of Crime and Criminal Justice: 
Quantitative Studies (Washington, D.C., 
1980), p. 89. 


GRAPHICAL EXCELLENCE 15 


Of course, statistical graphics, just like statistical calculations, are
only as good as what goes into them. An ill-specified or preposterous
model or a puny data set cannot be rescued by a graphic
(or by calculation), no matter how clever or fancy. A silly theory
means a silly graphic:

Edward R. Dewey and Edwin F. Dakin,
Cycles: The Science of Prediction (New
York, 1947), p. 144.


[Figure: SOLAR RADIATION AND STOCK PRICES. Curves labeled New York Stock Prices, Solar Radiation, and London Stock Prices; vertical axis in calories per sq. cm. per minute; horizontal axis by month, January to December.]

A. New York stock prices (Barron’s average). B. Solar Radiation, inverted,
and C. London stock prices, all by months, 1929 (after Garcia-Mata and
Shaffner).


Let us turn to the practice of graphical excellence, the efficient 
communication of complex quantitative ideas. Excellence, nearly 
always of a multivariate sort, is illustrated here for fundamental 
graphical designs: data maps, time-series, space-time narrative 
designs, and relational graphics. These examples serve several 
purposes, providing a set of high-quality graphics that can be
discussed (and sometimes even redrawn) in constructing a theory 
of data graphics, helping to demonstrate a descriptive terminology, 
and telling in brief about the history of graphical development. 
Most of all, we will be able to see just how good statistical 
graphics can be. 




Data Maps 


These six maps report the age-adjusted death rate from various
types of cancer for the 3,056 counties of the United States. Each
map portrays some 21,000 numbers.1 Only a picture can carry such
a volume of data in such a small space. Furthermore, all that data,
thanks to the graphic, can be thought about in many different
ways at many different levels of analysis—ranging from the
contemplation of general overall patterns to the detection of very
fine county-by-county detail. To take just a few examples, look
at the


* high death rates from cancer in the northeast part of the country 
and around the Great Lakes 


* low rates in an east-west band across the middle of the country 


* higher rates for men than for women in the south, particularly 
Louisiana (cancers probably caused by occupational exposure, 
from working with asbestos in shipyards) 


* unusual hot spots, including northern Minnesota and a few
counties in Iowa and Nebraska along the Missouri River 


* differences in types of cancer by region (for example, the high 
rates of stomach cancer in the north-central part of the country 
— probably the result of the consumption of smoked fish by 
Scandinavians) 


* rates in areas where you have lived. 


The maps provide many leads into the causes—and avoidance— 
of cancer. For example, the authors report: 


In certain situations . . . the unusual experience of a county 
warrants further investigation. For example, Salem County, 
New Jersey, leads the nation in bladder cancer mortality 
among white men. We attribute this excess risk to occupational 
exposures, since about 25 percent of the employed persons in 
this county work in the chemical industry, particularly the 
manufacturing of organic chemicals, which may cause bladder 
tumors. After the finding was communicated to New Jersey 
health officials, a company in the area reported that at least 330 
workers in a single plant had developed bladder cancer during 
the last 50 years. It is urgent that surveys of cancer risk and 
programs in cancer control be initiated among workers and 
former workers in this area.2


1 Each county’s rate is located in two
dimensions and, further, at least four
numbers would be necessary to reconstruct
the size and shape of each county.
This yields 7 × 3,056 entries in a data
matrix sufficient to reproduce a map.


[Map legend, darkest to lightest shading:]
In highest decile, statistically significant
Significantly high, but not in highest decile
In highest decile, but not statistically significant
Not significantly different from U.S. as a whole
Significantly lower than U.S. as a whole


2 Robert Hoover, Thomas J. Mason,
Frank W. McKay, and Joseph F. Frau- 
meni, Jr., “Cancer by County: New 
Resource for Etiologic Clues,” Science, 
189 (September 19, 1975), 1006. 


Maps from Atlas of Cancer Mortality for 
U.S. Counties: 1950-1969, by Thomas J. 
Mason, Frank W. McKay, Robert 
Hoover, William J. Blot, and Joseph F. 
Fraumeni, Jr. (Washington, D.C.: Public
Health Service, National Institutes of 
Health, 1975). The six maps shown here 
were redesigned and redrawn by 
Lawrence Fahey and Edward Tufte. 


[Figures: six county-level maps of the United States, captioned as follows]

All types of cancer, white females;
age-adjusted rate by county, 1950-1969

All types of cancer, white males;
age-adjusted rate by county, 1950-1969

Trachea, bronchus, and lung cancer;
white females; age-adjusted rate
by county, 1950-1969

Trachea, bronchus, and lung cancer;
white males; age-adjusted rate
by county, 1950-1969

Stomach cancer, white females;
age-adjusted rate by county, 1950-1969

Stomach cancer, white males;
age-adjusted rate by county, 1950-1969



The maps repay careful study. Notice how quickly and naturally 
our attention has been directed toward exploring the substantive 
content of the data rather than toward questions of methodology 
and technique. Nonetheless the maps do have their flaws. They 
wrongly equate the visual importance of each county with its 
geographic area rather than with the number of people living in 
the county (or the number of cancer deaths). Our visual impres- 
sion of the data is entangled with the circumstance of geographic 
boundaries, shapes, and areas—the chronic problem afflicting shaded- 
in-area designs of such “blot maps” or “patch maps.”
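The mismatch between geographic area and population can be made concrete with a small calculation. The sketch below uses three entirely hypothetical counties (the names, areas, and populations are invented for illustration) and compares each county's share of the shaded map area with its share of the people.

```python
# Hypothetical counties, invented for illustration only:
# (name, land area in square miles, population).
COUNTIES = [
    ("Large rural county", 6000.0, 20_000),
    ("Mid-sized county", 900.0, 150_000),
    ("Small urban county", 100.0, 830_000),
]


def shares(values):
    """Return each value as a fraction of the total."""
    total = sum(values)
    return [v / total for v in values]


area_share = shares([area for _, area, _ in COUNTIES])
pop_share = shares([pop for _, _, pop in COUNTIES])

for (name, _, _), a, p in zip(COUNTIES, area_share, pop_share):
    print(f"{name:20s} share of map area {a:6.1%}   share of population {p:6.1%}")
```

In this made-up case the rural county fills most of the shaded surface while holding a small fraction of the population, which is the distortion the paragraph above describes.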

A further shortcoming, a defect of data rather than graphical 
composition, is that the maps are founded on a suspect data source, 
death certificate reports on the cause of death. These reports fall 
under the influence of diagnostic fashions prevailing among doc- 
tors and coroners in particular places and times, a troublesome 
adulterant of the evidence purporting to describe the already some- 
times ambiguous matter of the exact bodily site of the primary 
cancer. Thus part of the regional clustering seen on the maps, as 
well as some of the hot spots, may reflect varying diagnostic 
customs and fads along with the actual differences in cancer rates 
between areas. 


Data maps have a curious history. It was not until the seventeenth 
century that the combination of cartographic and statistical skills 
required to construct the data map came together, fully 5,000 years 
after the first geographic maps were drawn on clay tablets. And 
many highly sophisticated geographic maps were produced cen- 
turies before the first map containing any statistical material was 
drawn. For example, a detailed map with a full grid was engraved
during the eleventh century A.D. in China. The Yü Chi Thu (Map
of the Tracks of Yü the Great) shown here is described by Joseph
Needham as the


... most remarkable cartographic work of its age in any 
culture, carved in stone in +1137 but probably dating from 
before +1100. The scale of the grid is 100 li to the division.
The coastal outline is relatively firm and the precision of the
network of river systems extraordinary. The size of the original,
which is now in the Pei Lin Museum at Sian, is about 3 feet
square. The name of the geographer is not known. . . . Anyone 
who compares this map with the contemporary productions 
of European religious cosmography cannot but be amazed at 
the extent to which Chinese geography was at that time ahead 
of the West. . . . There was nothing like it in Europe till the 
Escorial MS. map of about +1550. . . .4


3 Data maps are usually described as
“thematic maps” in cartography. For a
thorough account, see Arthur H. Robinson,
Early Thematic Mapping in the
History of Cartography (Chicago, 1982).
On the history of statistical graphics, see
H. Gray Funkhouser, “Historical Development
of the Graphical Representation
of Statistical Data,” Osiris, 3 (November
1937), 269-404; and James R. Beniger
and Dorothy L. Robyn, “Quantitative
Graphics in Statistics: A Brief History,”
American Statistician, 32 (February 1978),
1-11.


4 Joseph Needham, Science and Civilisa-
tion in China (Cambridge, 1959), vol. 3, 
546-547. 




[Figure: the Yü Chi Thu (Map of the Tracks of Yü the Great)]


E. Chavannes, “Les Deux Plus Anciens
Spécimens de la Cartographie Chinoise,”
Bulletin de l’École Française de l’Extrême
Orient, 3 (1903), 1-35, Carte B.




[Figure: map diagram from Apianus, headed “Ecce formulam, usum, atque structuram Tabularum Ptolomaei, cum quibusdam locis, in quibus studiosus Geographiae se satis exercere potest.” Edge labels: SEPTENTRIO (pars superior) and MERIDIES (pars inferior).]


The 1546 edition of Cosmographia by Petrus Apianus contained 
examples of map design that show how very close European car- 
tography by that time had come to achieving statistical graphicacy, 
even approaching the bivariate scatterplot. But, according to the 
historical record, no one had yet made the quantitative abstraction 
of placing a measured quantity on the map's surface at the inter- 
section of the two threads instead of the name of a city, let alone 
the more difficult abstraction of replacing latitude and longitude 
with some other dimensions, such as time and money. Indeed, it
was not until 1786 that the first economic time-series was plotted. 




[Figure: Edmond Halley's chart of trade winds and monsoons on a world map]


One of the first data maps was Edmond Halley’s 1686 chart 
showing trade winds and monsoons on a world map.5 The detailed 
section below shows the cartographic symbolization; with, as 
Halley wrote, “... the sharp end of each little stroak pointing out 
that part of the Horizon, from whence the wind continually comes; 
and where there are Monsoons the rows of stroaks run alternately 
backwards and forwards, by which means they are thicker [denser] 
than elsewhere.”


[Figure: detailed section of Halley's chart showing the stroke symbols]


5 Norman J. W. Thrower, “Edmond
Halley as a Thematic Geo-Cartographer,”
Annals of the Association of American
Geographers, 59 (December 1969),
652-676.


Edmond Halley, “An Historical Ac-
count of the Trade Winds, and Mon- 
soons, Observable in the Seas Between 
and Near the Tropicks; With an At- 
tempt to Assign the Phisical Cause of 
Said Winds," Philosophical Transactions, 
183 (1686), 153-168. 




An early and most worthy use of a map to chart patterns of 
disease was the famous dot map of Dr. John Snow, who plotted 
the location of deaths from cholera in central London for Sep- 
tember 1854. Deaths were marked by dots and, in addition, the 
area's eleven water pumps were located by crosses. Examining the 
scatter over the surface of the map, Snow observed that cholera 
occurred almost entirely among those who lived near (and drank 
from) the Broad Street water pump. He had the handle of the 
contaminated pump removed, ending the neighborhood epidemic 
which had taken more than 500 lives. The pump is located at the
center of the map, just to the right of the D in BROAD STREET. Of 
course the link between the pump and the disease might have been 
revealed by computation and analysis without graphics, with some 
good luck and hard work. But, here at least, graphical analysis 
testifies about the data far more efficiently than calculation. 


[Figure: redrawing of John Snow's dot map of central London, with a scale in yards; dots mark deaths from cholera]


Charles Joseph Minard gave quantity as well as direction to the 
data measures located on the world map in his portrayal of the 
1864 exports of French wine: 


E. W. Gilbert, “Pioneer Maps of Health
and Disease in England,” Geographical
Journal, 124 (1958), 172-183. Shown here
is a redrawing of John Snow’s map. For
a reproduction and detailed analysis of the
original map, see Edward Tufte, Visual
Explanations: Images and Quantities, Evidence
and Narrative (Cheshire, Connecticut,
1997), Chapter 2. Ideally, see John Snow,
On the Mode of Communication of Cholera
(London, 1855).


[Figure: Minard's map of the 1864 exports of French wine]

Charles Joseph Minard, Tableaux Graphiques
et Cartes Figuratives de M. Minard,
1845-1869, a portfolio of his work held
by the Bibliothèque de l’École Nationale
des Ponts et Chaussées, Paris.




Computerized cartography and modern photographic techniques 
have increased the density of information some 5,000-fold in the 
best of current data maps compared to Halley's pioneering effort. 
This map shows the distribution of 1.3 million galaxies (including 
some overlaps) in the northern galactic hemisphere. The map 
divides the sky into 1,024 X 2,222 rectangles. The number of gal- 
axies counted in each of the 2,275,328 rectangles is represented by 
ten gray tones; the darker the tone, the greater the number of 
galaxies counted. The north galactic pole is at the center. The 
sharp edge on the left results from the earth blocking the view 
from the observatory. In the area near the perimeter of the map, 
the view is obscured by the interstellar dust of the galaxy in which 
we live (the Milky Way) as the line of sight passes through the 
flattened disk of our galaxy. The curious texture of local clusters 
of galaxies seen in this truly new view of the universe was not 
anticipated by students of galaxies, who had, of course, micro- 
scopically examined millions of photographs of galaxies before 
seeing this macroscopic view. Although the clusters are clearly 
evident (and accounted for by a theory of galactic origins), the
seemingly random filaments may be happenstance. The producers
of the map note the “strong temptation to conclude that the galaxies
are arranged in a remarkable filamentary pattern on scales
of approximately 5° to 15°, but we caution that this visual impression
may be misleading because the eye tends to pick out linear
patterns even in random noise. Indeed, roughly similar patterns
are seen on maps constructed from simulated catalogs where no
linear structure has been built in. . . .”7

7 Michael Seldner, B. H. Siebers, Edward
J. Groth, and P. James E. Peebles, “New
Reduction of the Lick Catalog of
Galaxies,” Astronomical Journal, 82 (April
1977), 249-314. See Gillian R. Knapp,
“Mining the Heavens: The Sloan Digital
Sky Survey,” Sky & Telescope (August
1997), 40-48; Margaret J. Geller and John
P. Huchra, “Mapping the Universe,”
Sky & Telescope (August 1991), 134-139.


The most extensive data maps, such as the cancer atlas and the
count of the galaxies, place millions of bits of information on a
single page before our eyes. No other method for the display of
statistical information is so powerful.
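The arithmetic behind the galaxy map can be sketched in a few lines. The grid size and the ten gray tones come from the description above; the linear mapping from count to tone and the maximum count used in the demonstration are assumptions made only for illustration, since the text says no more than that darker tones mean more galaxies.

```python
ROWS, COLS = 1024, 2222   # grid dimensions stated in the text
N_TONES = 10              # ten gray tones, per the text


def cell_count(rows=ROWS, cols=COLS):
    """Number of rectangles into which the sky is divided."""
    return rows * cols


def tone_for_count(count, max_count, n_tones=N_TONES):
    """Map a galaxy count to a tone index, 0 (lightest) to n_tones - 1 (darkest).

    Linear scaling with rounding is an assumption, not the map makers' method.
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    clipped = min(max(count, 0), max_count)
    return int(clipped * (n_tones - 1) / max_count + 0.5)


print(f"{cell_count():,} rectangles")
# Hypothetical maximum of 20 galaxies per rectangle, for demonstration.
print([tone_for_count(c, 20) for c in (0, 5, 10, 15, 20)])
```

With one count per rectangle, the grid alone accounts for more than two million numbers on a single page.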




[Figure: map of galaxy counts in the northern galactic hemisphere]




Time-Series 


The time-series plot is the most frequently used form of graphic
design.8 With one dimension marching along to the regular rhythm
of seconds, minutes, hours, days, weeks, months, years, centuries, or
millennia, the natural ordering of the time scale gives this design a
strength and efficiency of interpretation found in no other graphic
arrangement.

8 A random sample of 4,000 graphics
drawn from 15 of the world’s newspapers
and magazines published from
1974 to 1980 found that more than 75
percent of all the graphics published
were time-series. Chapter 3 reports more
on this.


This reputed tenth- (or possibly eleventh-) century illustration 
of the inclinations of the planetary orbits as a function of time, 
apparently part of a text for monastery schools, is the oldest known 
example of an attempt to show changing values graphically. It 
appears as a mysterious and isolated wonder in the history of data 
graphics, since the next extant graphic of a plotted time-series 
shows up some 800 years later. According to Funkhouser, the
astronomical content is confused and there are difficulties in reconciling
the graph and its accompanying text with the actual movements
of the planets. Particularly disconcerting is the wavy path
ascribed to the sun. An erasure and correction of a curve occur
near the middle of the graph.

9 H. Gray Funkhouser, “A Note on
a Tenth Century Graph,” Osiris, 1
(January 1936), 260-262.


[Figure: tenth- or eleventh-century illustration of the inclinations of the planetary orbits as a function of time]




It was not until the late 1700s that time-series charts began to 
appear in scientific writings. This drawing of Johann Heinrich 
Lambert, one of a long series, shows the periodic variation in soil 
temperature in relation to the depth under the surface. The greater 
the depth, the greater the time-lag in temperature responsiveness. 
Modern graphic designs showing time-series periodicities differ 
little from those of Lambert, although the data bases are far larger. 


J. H. Lambert, Pyrometrie (Berlin, 1779). 


[Figure: VOYAGER 2 radio emissions from Jupiter, July 2 to July 8, 1979, with the northern and southern sources marked; stacked panels for the sampled radio bands (56.2 kHz and 17.8 kHz legible); bottom panel shows magnetic and jovigraphic latitude from -20° to 20°; horizontal scale labeled by date and by distance in Jupiter radii (R).]


This plot of radio emissions from Jupiter is based on data collected 
by Voyager 2 in its pass close by the planet in July 1979. The radio 
intensity increases and decreases in a ten-hour cycle as Jupiter 
rotates. Maximum intensity occurs when the Jovian north mag- 
netic pole is tipped toward the spacecraft, indicating a northern 
hemisphere source. A southern source was detected on July 7, as 
the spacecraft neared the equatorial plane. The horizontal scale 
shows the distance of the spacecraft from the planet measured in 
terms of Jupiter radii (R). Note the use of dual labels on the hori- 
zontal to indicate both the date and distance from Jupiter. The 
entire bottom panel also serves to label the horizontal scale, 
describing the changing orientation of the spacecraft relative to 
Jupiter as the planet is approached. The multiple time-series
enforce not only comparisons within each series over time (as do
all time-series plots) but also comparisons between the three
different sampled radio bands shown. This richly multivariate
display is based on 453,600 instrument samples of eight bits each.
The resulting 3.6 million bits were reduced by peak and average
processing to the 18,900 points actually plotted on the graphic.

D. A. Gurnett, W. S. Kurth, and F. L.
Scarf, “Plasma Wave Observations Near
Jupiter: Initial Results from Voyager
2,” Science, 206 (November 23, 1979),
987-991; and letter from Donald A.
Gurnett to Edward R. Tufte, June 27,
1980.
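The data-reduction figures quoted for the Voyager 2 plot can be checked with a little arithmetic. The sketch below also shows one plausible reading of "peak and average processing": summarizing each fixed-size window of samples by its maximum and its mean. That windowing scheme, the function name `reduce_windows`, and the demonstration data are assumptions for illustration; the source does not describe the actual procedure.

```python
SAMPLES = 453_600        # instrument samples, per the text
BITS_PER_SAMPLE = 8      # eight bits each, per the text
PLOTTED_POINTS = 18_900  # points actually plotted, per the text

total_bits = SAMPLES * BITS_PER_SAMPLE         # about 3.6 million bits
samples_per_point = SAMPLES // PLOTTED_POINTS  # raw samples behind each point


def reduce_windows(samples, window):
    """Summarize each consecutive full window as a (peak, average) pair.

    Assumed scheme for illustration; trailing partial windows are dropped.
    """
    reduced = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        reduced.append((max(chunk), sum(chunk) / len(chunk)))
    return reduced


# Synthetic 8-bit values, invented purely to exercise the function.
demo_samples = [(i * 37) % 256 for i in range(4 * samples_per_point)]
demo_reduced = reduce_windows(demo_samples, samples_per_point)

print(f"total bits: {total_bits:,}")
print(f"samples per plotted point: {samples_per_point}")
print(f"demo windows produced: {len(demo_reduced)}")
```

The division comes out even, so each plotted point stands for the same number of raw samples under this reading.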




Time-series displays are at their best for big data sets with real
variability. Why waste the power of data graphics on simple linear
changes, which can usually be better summarized in one or two numbers?
Instead, graphics should be reserved for the richer, more complex,
more difficult statistical material. This New York City weather
summary for 1980 depicts 1,888 numbers. The daily high and
low temperatures are shown in relation to the long-run average.
The path of the normal temperatures also provides a forecast of
expected change over the year; in the middle of February, for
instance, New York City residents can look forward to warming
at the rate of about 1.5 degrees per week all the way to July,
the yearly peak. This distinguished graphic successfully organizes
a large collection of numbers, makes comparisons between different
parts of the data, and tells a story.


[Figure: New York City's Weather for 1980. Panels: annual
temperature, with actual daily highs and lows against the normal
high and low; precipitation by month, actual against normal.]

New York Times, January 11, 1981, p. 32.


[Figure: Marey's graphical train schedule, Paris to Lyon. Stations
run down the vertical axis (Paris, Moret, Montereau, Laroche,
Tonnerre, Les Laumes, Dijon, Chagny, Chalon-sur-Saône, Mâcon,
Lyon Perrache); hours of the day run along the horizontal.]


E. J. Marey, La méthode graphique (Paris, 
1885), p. 20. The method is attributed 
to the French engineer, Ibry. 


A design with similar strengths is Marey’s graphical train sched-
ule for Paris to Lyon in the 1880s. Arrivals and departures from a
station are located along the horizontal; length of stop at a station
is indicated by the length of the horizontal line. The stations are
separated in proportion to their actual distance apart. The slope
of the line reflects the speed of the train: the more nearly vertical
the line, the faster the train. The intersection of two lines locates
the time and place that trains going in opposite directions pass
each other.
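The geometry of the schedule reduces to two small calculations: slope gives speed, and the intersection of two lines gives the meeting point. The following is a minimal sketch under stated assumptions: the function names `segment_speeds` and `crossing` are hypothetical, each leg is treated as constant-speed, and the times and distances are invented for illustration rather than taken from Marey's chart.

```python
def segment_speeds(stops):
    """Return the speed (km/h) on each leg of one train's schedule.

    stops: list of (hour, km_from_origin) pairs with strictly increasing hours.
    A station stop appears as two points at the same distance, giving speed 0.
    """
    return [(d1 - d0) / (t1 - t0) for (t0, d0), (t1, d1) in zip(stops, stops[1:])]


def crossing(train_a, train_b):
    """Return (hour, km) where two constant-speed trains pass each other.

    Each train is (start_hour, start_km, speed_kmh); the speeds must differ.
    """
    ta, da, va = train_a
    tb, db, vb = train_b
    # Solve da + va * (t - ta) == db + vb * (t - tb) for t.
    t = (db - da + va * ta - vb * tb) / (va - vb)
    return t, da + va * (t - ta)


# Hypothetical train: 2 hours running, a half-hour stop, 2 more hours running.
speeds = segment_speeds([(0, 0), (2, 100), (2.5, 100), (4.5, 220)])

# Hypothetical trains leaving opposite ends of a 500 km line at the same hour.
meeting = crossing((0, 0, 50), (0, 500, -50))
```

With distance on the vertical axis and time on the horizontal, a steeper line covers more distance per hour, which is why the more nearly vertical line is the faster train.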




In 1981 a new express train from Paris to Lyon cut the trip to
under three hours, compared to more than nine hours when Marey
published the graphical train schedule. The path of the modern
TGV (train à grande vitesse) is shown, overlaid on the schedule of
100 years before:


The two great inventors of modern graphical designs were J. H.
Lambert (1728-1777), a Swiss-German scientist and mathematician,
and William Playfair (1759-1823), a Scottish political economist.¹⁰
The first known time-series using economic data was published in
Playfair's remarkable book, The Commercial and Political Atlas (Lon-
don, 1786). Note the graphical arithmetic, which shows the shift-
ing balance of trade by the difference between the import and
export time-series. Playfair contrasted his new graphical method
with the tabular presentation of data:


Information, that is imperfectly acquired, is generally as imper-
fectly retained; and a man who has carefully investigated a
printed table, finds, when done, that he has only a very faint
and partial idea of what he has read; and that like a figure
imprinted on sand, is soon totally erased and defaced. The
amount of mercantile transactions in money, and of profit or
loss, are capable of being as easily represented in drawing, as
any part of space, or as the face of a country; though, till now,
it has not been attempted. Upon that principle these Charts
were made; and, while they give a simple and distinct idea,
they are as near perfect accuracy as is any way useful. On
inspecting any one of these Charts attentively, a sufficiently
distinct impression will be made, to remain unimpaired for a
considerable time, and the idea which does remain will be
simple and complete, at once including the duration and the
amount. [pages 3-4]


10 Laura Tilling, “Early Experimental
Graphs,” British Journal for the History
of Science, 8 (1975), 193-213.

[Figure: Playfair's chart of the imports and exports to and from
England, from The Commercial and Political Atlas. The divisions
at the bottom mark years; those on the right, millions of pounds.]

For Playfair, graphics were preferable to tables because graphics
showed the shape of the data in a comparative perspective.
Time-series plots did this, and all but one of the 44 charts in the
first edition of The Commercial and Political Atlas were time-series.
That one exception is the first known bar chart, which Playfair
invented because year-to-year data were missing and he needed a
design to portray the one-year data that were available.
Nonetheless he was skeptical about his innovation:


This Chart is different from the others in principle, as it does 
not comprehend any portion of time, and it is much inferior 
in utility to those that do; for though it gives the extent of the 
different branches of trade, it does not compare the same 
branch of commerce with itself at different periods; nor does 
it imprint upon the mind that distinct idea, in doing which, 
the chief advantage of Charts consists: for as it wants the di- 
mension that is formed by duration, there is no shape given 
to the quantities. [page 101] 


He was right: small, noncomparative, highly labeled data sets 
usually belong in tables. 


[Figure: Playfair's bar chart, “Exports and Imports of Scotland to
and from different parts for one Year from Christmas 1780 to
Christmas 1781.”]


The chart does show, at any rate, the imports (cross-hatched 
lines) and exports (solid lines) to and from Scotland in 1781 for 
17 countries, which are ordered by volume of trade. The horizontal 
scale is at the top, possibly to make it more convenient to see in 
plotting the points by hand. Zero values are nicely indicated both 
by the absence of a bar and by a “0.” The horizontal scale mis-
takenly repeats “200.” In nearly all his charts, Playfair placed the
labels for the vertical scale on the right side of the page (suggest- 
ing that he plotted the data points using his left hand). 




Playfair's last book addressed the question whether the price of 
wheat had increased relative to wages. In his Letter on our agricul- 
tural distresses, their causes and remedies; accompanied with tables and 
copper-plate charts shewing and comparing the prices of wheat, bread and 
labour, from 1565 to 1821, Playfair wrote: 


You have before you, my Lords and Gentlemen, a chart of 
the prices of wheat for 250 years, made from official returns; 
on the same plate I have traced a line representing, as nearly as 
I can, the wages of good mechanics, such as smiths, masons, 
and carpenters, in order to compare the proportion between 
them and the price of wheat at every different period. . . . the 
main fact deserving of consideration is, that never at any former 
period was wheat so cheap, in proportion to mechanical labour, 
as it is at the present time. . . . [pages 29-31] 


Here Playfair plotted three parallel time-series: prices, wages, and 
the reigns of British kings and queens. 


[Figure: Playfair's chart showing at one view the price of the
quarter of wheat and the wages of labour by the week, 1565 to 1821.]


The history and genealogy of royalty was long a graphical favorite. 
This superb construction of E. J. Marey brings together several 
sets of facts about English rulers into a time-series that conveys a 
sense of the march of history. Marey (1830-1904) also pioneered 
the development of graphical methods in human and animal 
physiology, including studies of horses moving at different paces, 


[Figure: Marey's time-series of English rulers.]

E. J. Marey, La méthode graphique (Paris, 1885), p. 6.

[Figure: tracks of the horse at different paces: walk (long stride),
quick walk, amble, jog-trot, gallop.]

E. J. Marey, Movement (London, 1895). Beginning with the tracks of
the horse, the time-series are from pages 191, 224, 222, 265, 60,
and 61.


the movement of a starfish turning itself over (read images from 
the bottom upwards), 


as well as the advance of the gecko. 


Marey's man in black velvet, photographed in stick-figure images, 
became the time-series forerunner of Marcel Duchamp's Nude 
Descending a Staircase. 




The problem with time-series is that the simple passage of time
is not a good explanatory variable: descriptive chronology is not
causal explanation. There are occasional exceptions, especially when
there is a clear mechanism that drives the Y-variable. This time-
series does testify about causality: the outgoing mail of the U.S.
House of Representatives peaks every two years, just before the
election day:


[Figure: monthly outgoing mail workload, millions of units, 1967 to
1972, with peaks labeled October 1968, October 1970, and October
1972.]


The graphic is worth at least 700 words, the number used in a 
news report describing how incumbent representatives exploit their 
free mailing privileges to advance their re-election campaigns: 


FRANKED MAIL TIE
TO VOTING SHOWN

Testimony Finds the Volume
Rises Before Elections

WASHINGTON, June 1 (AP) - New court testimony and documents
show that much of the mail Congress sends at taxpayer expense is
tied directly to the re-election campaigns of Senate and House
members.

According to material filed in a lawsuit in Federal Court:

¶Senate Republicans put two direct-mail experts on the public
payroll to advise them on how to use their free mailing privileges
to get votes.

¶An election manual prepared for Senate Democrats refers to
newsletters as a “free forum,” and sets up a timetable for sending
them as an integral part of a model re-election campaign.

¶Senator John G. Tower, Republican of Texas, mailed more than
800,000 special-interest letters at taxpayer expense as part of his
1972 re-election effort and received campaign volunteer offers and
donations in response.

¶Senator Jacob K. Javits, Republican of New York, gave written
approval in 1973 for a tax-paid mail program intended to better his
image and pay off at the polls. He focused his mail on areas where
he needed votes.

¶The volume of “official” Congressional mail rises in election
years and peaks just before the general election.

None of this activity necessarily violates any law or regulation,
since Congress has wide discretion in the use of tax-paid mail.
Congress gave itself the right to send official mail at Government
expense at the founding of the republic, and only Congress polices
against abuses of the free mailings.

Complaints of political use of the free-mailing privilege, called
the franking privilege, are heard every election year. Recently,
however, the volume and cost of franked mail has multiplied. A new
Federal law will limit what out-of-office challengers can spend to
unseat incumbents.

In 1972, Congress passed a law prohibiting mass franked mailings
within 28 days before an election. The sponsor of that legislation,
Representative Morris K. Udall, Democrat of Arizona, said in an
interview that further changes were needed to curtail political
abuse of the frank.

Mr. Udall urged a 60-day pre-election cutoff for mass mailings and
said he favored closing a loophole that recently allowed defeated
Representative Frank M. Clark, Democrat of Pennsylvania, to send a
franked newsletter to his old constituents after he had left
office. Mr. Clark is seeking to regain his old post.

Practice Documented

Seldom has the political use of franked mail been so well
documented as in recent testimony and documents filed in a Federal
Court by Common Cause, the lobby group, which is suing for an end
to tax-financed mass mailings by Congress.

For example, Joyce P. Baker, a political mail specialist, said in a
1973 job proposal that she wanted to set up direct-mail programs
for Republican Senators using franked mail.

“The purpose of such a program is to help an incumbent Senator get
re-elected,” she said.

She was put on the Senate payroll at $18,810 a year in 1973 and
1974 and testified that during that time she aided Republican
Senators Robert J. Dole of Kansas, Peter H. Dominick of Colorado,
Charles McC. Mathias Jr. of Maryland

Another political mail specialist, Lee W. MacGregor, wrote a
proposal for the use of franked mail by his chief, Senator Javits,
in 1973.

“The over-all objective of the franked mail program can be to get
the recipient of the mail to identify positively with a particular
stand you have taken or a bill you have introduced; the kind of
identification that can be translated into a vote at the polls on
election day” Mr. MacGregor said.

Mr. Javits was out of the country and could not be reached. His
administrative assistant, Donald Kellerman, defended the use of
franked mail.

“It is a standard device to let voters, not voters but citizens,
know what the Senator is doing here in Washington, he said.

Senator Tower's use of franked mail in his 1972 campaign was
documented by memorandums.

Tom Loeffler, a high-ranking campaign aide, wrote in a memorandum
dated Oct. 27, 1972, that during the campaign Senator Tower had
sent “31 special interest letters generating approximately 803,333
franked mailings.”

Mr. Tower was not available for comment. His administrative
assistant, Elwin Skiles, said the Senator's use of franked mail in
1972 was within the law, and he defended the free-mailing
privileges.

Postal Service figures show that in the 12 months before November,
1973, Congress sent 222.9 million franked pieces of mail. But in
the next 12 months, covering the election season of 1974, Congress
sent 350.6 million, a jump of 57 per cent about what's happening,”
Mr. Skiles said.




Time-series plots can be moved toward causal explanation by 
smuggling additional variables into the graphic design. For example, 
this decomposition of economic data, arraying 1,296 numbers, 
breaks out the top series into seasonal and trading-day fluctuations 
(which dominate short-term changes) to reveal the long-run trend 
adjusted for inflation.¹¹ (Note a significant defect in the design,
however: the vertical grid conceals the height of the December 
peaks.) The next step would be to bring in additional variables to 
explain the transformed and improved series at the bottom. 
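The idea of peeling systematic components off a series can be sketched in miniature. This is a deliberately crude illustration and not the procedure behind the figure: it estimates seasonal factors by a simple ratio to the overall mean (which ignores trend), then divides by a price index to deflate. The function names, the period, and all numbers are assumptions made for the example.

```python
def seasonal_factors(values, period):
    """Mean of each season's values divided by the overall mean."""
    overall = sum(values) / len(values)
    factors = []
    for season in range(period):
        in_season = values[season::period]
        factors.append((sum(in_season) / len(in_season)) / overall)
    return factors


def adjust(values, period, price_index):
    """Divide out seasonal factors, then deflate by a price index (base 1.0)."""
    factors = seasonal_factors(values, period)
    return [value / factors[i % period] / price_index[i]
            for i, value in enumerate(values)]


# Hypothetical sales that alternate low/high with a period of two,
# while prices double halfway through.
sales = [10, 20, 10, 20]
prices = [1.0, 1.0, 2.0, 2.0]
real_adjusted = adjust(sales, 2, prices)
```

In this toy case the alternation and the price rise are both removed, leaving a series that halves in real terms; that remaining movement is what further explanatory variables would then have to account for.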


[Figure: Systematic and Irregular Components of Total Retail Sales,
United States, 1960 to 1971. Panels: unadjusted data (billions of
dollars); holiday variation, trading day variation, seasonal
variation, and irregular component (percent); trend-cycle component,
MCD curve, and seasonally adjusted series (billions of dollars);
deflated retail sales.]


11 See William S. Cleveland and Irma J. 
Terpenning, “Graphical Methods for 
Seasonal Adjustment," Journal of the 
American Statistical Association 77 (March 
1982), 52-62. 


Julius Shiskin, “Measuring Current Eco- 
nomic Fluctuations," Statistical Reporter 
(July 1973), p. 3.


Finally, a vivid design (with appropriate data) is the before-after 
time-series: 


[Figure: “A monopole?” Detector signal against time of day (PST),
February 14, 1982, with the liquid nitrogen transfer marked.]


Cabrera's candidate monopole signal looms over a disturbance caused by a liquid nitrogen 
transfer earlier in the day. The jump in magnetic flux through the superconducting detector 
loop (or equivalently, the jump in the loop's supercurrent) is just the right magnitude to be a 
monopole. Moreover, the current remained stable for many hours afterward. 


And before and after the collapse of a bridge on the Rhône in 1840:

[Figure: Pont de Bourg-St-Andéol sur le Rhône.]


M. Mitchell Waldrop, “In Search 
of the Magnetic Monopole,” Science 
(June 4, 1982), p. 1087. 


Charles Joseph Minard, “De la Chute
des Ponts dans les grandes Crues,”
(October 24, 1856), Figure 3, in Minard,
Collection de ses brochures (Paris, 1821-
1869), held by the Bibliothèque de
l'École Nationale des Ponts et Chaussées,
Paris.


Narrative Graphics of Space and Time 


An especially effective device for enhancing the explanatory power 
of time-series displays is to add spatial dimensions to the design of 
the graphic, so that the data are moving over space (in two or three 
dimensions) as well as over time. Three excellent space-time-story 
graphics illustrate here how multivariate complexity can be subtly 
integrated into graphical architecture, integrated so gently and 
unobtrusively that viewers are hardly aware that they are looking 
into a world of four or five dimensions. Occasionally graphics are 
belligerently multivariate, advertising the technique rather than 
the data. But not these three. 


The first is the classic of Charles Joseph Minard (1781-1870), the 
French engineer, which shows the terrible fate of Napoleon's army
in Russia. Described by E. J. Marey as seeming to defy the pen of
the historian by its brutal eloquence,¹² this combination of data map
and time-series, drawn in 1869, portrays a sequence of devastating losses 
suffered in Napoleon’s Russian campaign of 1812. Beginning at left 
on the Polish-Russian border near the Niemen River, the thick tan 
flow-line shows the size of the Grand Army (422,000) as it invaded 
Russia in June 1812. The width of this band indicates the size of the 
army at each place on the map. In September, the army reached 
Moscow, which was by then sacked and deserted, with 100,000 
men. The path of Napoleon’s retreat from Moscow is depicted by 
the darker, lower band, which is linked to a temperature scale and 
dates at the bottom of the chart. It was a bitterly cold winter, and 
many froze on the march out of Russia. As the graphic shows, the 
crossing of the Berezina River was a disaster, and the army finally 
struggled back into Poland with only 10,000 men remaining. Also 
shown are the movements of auxiliary troops, as they sought to 
protect the rear and the flank of the advancing army. Minard’s 
graphic tells a rich, coherent story with its multivariate data, far 
more enlightening than just a single number bouncing along over 
time. Six variables are plotted: the size of the army, its location
on a two-dimensional surface, direction of the army’s movement,
and temperature on various dates during the retreat from Moscow. 
At upper right we see Minard’s French original, which was printed 
as a two-color lithograph in the form of a small poster. And at 
lower right, our English translation. 


It may well be the best statistical graphic ever drawn. 


12 E. J. Marey, La méthode graphique
(Paris, 1885), p. 73. For more on Minard, 
see Arthur H. Robinson, “The Thematic 
Maps of Charles Joseph Minard,” Imago 
Mundi, 21 (1967), 95-108. 


Upper image from Charles Joseph Minard, 
Tableaux Graphiques et Cartes Figuratives de 
M. Minard, 1845-1869, Bibliothèque de
l'École Nationale des Ponts et Chaussées, 
Paris, item 28 (62 by 25 cm, or 25 by 10 in). 
English translation by Dawn Finley and 
redrawing by Elaine Morse, completed 
August 2002. 


[Figure, upper: Minard's French original, “Carte figurative des
pertes successives en hommes de l'Armée Française dans la campagne
de Russie 1812-1813,” Paris, November 20, 1869.]

[Figure, lower: English translation, “Figurative Map of the
successive losses in men of the French Army in the Russian Campaign
1812-1813,” with the graphic table of the temperature in degrees of
the Réaumur thermometer below zero.]


The next time-space graphic, drawn by a computer, displays
the levels of three air pollutants located over a two-dimensional
surface (six counties in southern California) at four times during
the day. Nitrogen oxides (top row) are emitted by power plants,
refineries, and vehicles. Refineries along the coast and Kaiser Steel’s
Fontana plant produce the post-midnight peaks shown in the first
panel; traffic and power plants (with their heavy daytime demand)
send levels up during the day. Carbon monoxide (second row) is
low after midnight except out at the steel plant; morning traffic
then begins to generate each day's ocean of carbon monoxide,
with the greatest concentration at the convergence of five freeways
in downtown Los Angeles. Reactive hydrocarbons (third row),
like nitrogen oxides, come from refineries after midnight and then
increase with traffic during the day. Each of the 12 time-space-
pollutant slices summarizes pollutants for 2,400 spatial locations
(2,400 squares five kilometers on a side). Thus 28,800 pollutant
readings are shown, except for those masked by peaks.

Los Angeles Times, July 22, 1979; based on …

The air pollution display is a small multiple. The same graphical 
design structure is repeated for each of the twelve slices or multi- 
ples. Small multiples are economical: once viewers understand the 
design of one slice, they have immediate access to the data in all 
the other slices. Thus, as the eye moves from one slice to the next, 
the constancy of the design allows the viewer to focus on changes 
in the data rather than on changes in graphical design. 
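The constancy that makes small multiples work can be stated concretely: every panel being compared is drawn on one shared scale, so a difference between panels is a difference in the data. The sketch below is a hypothetical illustration of that rule, using invented function names and values; in a display like the air pollution one, the shared scale would apply to the panels of a single pollutant. It also checks the count of readings given in the text.

```python
def shared_scale(panels):
    """Return one (low, high) range that covers every panel's values."""
    flat = [value for panel in panels for value in panel]
    return min(flat), max(flat)


def normalize(panels):
    """Rescale all panels with the same range so heights are comparable."""
    low, high = shared_scale(panels)
    span = (high - low) or 1.0   # avoid dividing by zero for constant data
    return [[(value - low) / span for value in panel] for panel in panels]


# Counts quoted in the text.
pollutants, times_of_day, cells_per_slice = 3, 4, 2_400
slices = pollutants * times_of_day       # 12 small multiples
readings = slices * cells_per_slice      # 28,800 readings

# Two hypothetical panels of one pollutant at different times of day.
scaled = normalize([[0, 5], [10, 5]])
```

Scaling each panel to its own minimum and maximum would instead make every panel look equally dramatic, and the eye would be comparing designs rather than data.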


[Figure: small multiples of nitrogen oxides, carbon monoxide, and
reactive hydrocarbons over southern California at four times of day.]

Our third example of a space-time-story graphic ingeniously
mixes space and time on the horizontal axis. This design moves
well beyond the conventional time-series because of its clever
plotting field, with location relative to the ground surface on the
vertical axis and time/space on the horizontal. The life cycle of
the Japanese beetle is shown.

L. Hugh Newman, Man and Insects (London, 1965), pp. 104-105.

[Figure: life cycle of the Japanese beetle, by month.]


More Abstract Designs: Relational Graphics 


The invention of data graphics required replacing the latitude- 
longitude coordinates of the map with more abstract measures not 
based on geographical analogy. Moving from maps to statistical 
graphics was a big step, and thousands of years passed before this 
step was taken by Lambert, Playfair, and others in the eighteenth 
century. Even so, analogies to the physical world served as the 
conceptual basis for early time-series. Playfair repeatedly compared 
his charts to maps and, in the preface to the first edition of The 
Commercial and Political Atlas, argued that his charts corresponded 
to a physical realization of the data: 


Suppose the money we pay in any one year for the expence
of the Navy were in guineas, and that these guineas were laid
down upon a large table in a straight line, and touching each
other, and those paid next year were laid down in another
straight line, and the same continued for a number of years:
these lines would be of different lengths, as there were fewer
or more guineas; and they would make a shape, the dimensions
of which would agree exactly with the amount of the sums;
and the value of a guinea would be represented by the part of
space which it covered. The Charts are exactly this upon a
small scale, and one division represents the breadth or value of
ten thousand or an hundred thousand guineas as marked, with
the same exactness that a square inch upon a map may represent
a square mile of country. And they, therefore, are a represen-
tation of the real money laid down in different lines, as it was
originally paid away. [pages iii-iv]

[Figure: title page of The Statistical Breviary, by William Playfair.]

Fifteen years later in The Statistical Breviary, his most theoretical
book about graphics, Playfair broke free of analogies to the phys-
ical world and drew graphics as designs-in-themselves.

[Figure: plate from The Statistical Breviary (1801): a circle for
each country, flanked by vertical lines for population and revenue.]

One of four plates in The Statistical Breviary, this graphic is
distinguished by its multivariate data, the use of area to depict
quantity, and the pie chart, in apparently the first application of
these devices. The circle represents the area of each country; the
line on the left, the population in millions read on the vertical
scales; the line on the right, the revenue (taxes) collected in
millions of pounds sterling read also on the vertical scale; and the
“dotted lines drawn between the population and revenue, are merely
intended to connect together the lines belonging to the same
country. The ascent of those lines being from right to left, or from
left to right, shews whether in proportion to its population the
country is burdened with heavy taxes or otherwise” (pages 13-14).
The slope of the dotted line is uninformative, since it is dependent
on the diameter of the circle as well as the height of the two
verticals. However, the sign of the slope does make sense, taking
Playfair to his familiar point about what he regarded as excessive
taxation in Britain (sixth circle from the right, with the slope
running opposite to most countries). Playfair was enthusiastic about
the multivariate arrangement because it fostered comparisons:


The author of this work applied the use of lines to matters of 
commerce and finance about fifteen years ago, with great 
success. His mode was generally approved of as not only facili- 
tating, but rendering those studies more clear, and retained 
more easily by the memory. The present charts are in like 
manner intended to aid statistical studies, by shewing to the 
eye the sizes of different countries represented by similar forms, 
for where forms are not similar, the eye cannot compare them 
easily nor accurately. From this circumstance it happens, that 
we have a more accurate idea of the sizes of the planets, which 
are spheres, than of the nations of Europe which we see on 
the maps, all of which are irregular forms in themselves as well 
as unlike to each other. Size, Population, and Revenue, are the 
three principal objects of attention upon the general scale of 
statistical studies, whether we are actuated by curiosity or 
interest; I have therefore represented these three objects in one 
view. ... [page 15] 


But here Playfair had a forerunner—and one who thought more 
clearly about the abstract problems of graphical design than did 
Playfair, who lacked mathematical skills. A most remarkable and 
explicit early theoretical statement advancing the general (non- 
analogical) relational graphic was made by J. H. Lambert in 1765, 
35 years before The Statistical Breviary: 


We have in general two variable quantities, x, y, which will
be collated with one another by observation, so that we can
determine for each value of x, which may be considered as an
abscissa, the corresponding ordinate y. Were the experiments
or observations completely accurate, these ordinates would
give a number of points through which a straight or curved
line should be drawn. But as this is not so, the line deviates to
a greater or lesser extent from the observational points. It must
therefore be drawn in such a way that it comes as near as
possible to its true position and goes, as it were, through the
middle of the given points.¹³

13 Johann Heinrich Lambert, Beyträge
zum Gebrauche der Mathematik und deren
Anwendung (Berlin, 1765), as quoted in
Laura Tilling, “Early Experimental
Graphs,” British Journal for the History
of Science, 8 (1975), 204-205.
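Lambert's rule, a line drawn through the middle of the observed points, is a matter of judgment in his account. A modern analogue, offered here as a substitute technique and not as Lambert's method, is the ordinary least-squares line, whose residuals sum to zero so that it passes through the middle of the points in a precise sense. The function name and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical observations scattered about a straight line.
xs = [0, 1, 2, 3]
ys = [1.1, 2.9, 5.2, 6.8]
slope, intercept = least_squares_line(xs, ys)

# Deviations of the points from the fitted line balance out.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
```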




Lambert drew a graphical derivation of the evaporation rate of 
water as a function of temperature, according to Tilling. The 
analysis begins with two time-series: DEF, showing the decreasing 
height of water in a capillary tube as a function of time, and ABC,
the temperature. The slope of the curve DEF is then taken (note the 
tangent DEG) at a number of places, yielding the rate of evaporation: 


To complete the graphical calculus, the measured rate is plotted 
against the corresponding temperature in this relational graphic: 
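The two steps just described, taking slopes from one time-series and then plotting them against the other, can be sketched numerically. This is a minimal finite-difference illustration, assuming evenly recorded readings; it is not a reconstruction of Lambert's measurements, and the function names and numbers are hypothetical.

```python
def slopes(times, heights):
    """Finite-difference slope of height against time on each interval."""
    return [(heights[i + 1] - heights[i]) / (times[i + 1] - times[i])
            for i in range(len(times) - 1)]


def rate_versus_temperature(times, heights, temperatures):
    """Pair each interval's evaporation rate with its mean temperature."""
    # A falling water height means a positive evaporation rate.
    rates = [-s for s in slopes(times, heights)]
    mid_temps = [(a + b) / 2 for a, b in zip(temperatures, temperatures[1:])]
    return list(zip(mid_temps, rates))


# Hypothetical readings: water height falls faster as temperature rises.
points = rate_versus_temperature(
    times=[0, 1, 2],
    heights=[10, 9, 7],
    temperatures=[20, 22, 26],
)
```

Time has dropped out of the result: each pair relates rate to temperature directly, which is the move from a time-series to a relational graphic.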


Thus, by the early 1800s, graphical design was at last no longer 
dependent on direct analogy to the physical world—thanks to the 
work of Lambert and Playfair. This meant, quite simply but quite 
profoundly, that any variable quantity could be placed in rela- 
tionship to any other variable quantity, measured for the same 
units of observation. Data graphics, because they were relational 
and not ticd to geographic or time coordinates, became relevant 


J. H. Lambert, "Essai d'hygrométrie ou
sur la mesure de l'humidité," Mémoires
de l'Académie Royale des Sciences et Belles-
Lettres... 1769 (Berlin, 1771), plate 1,
facing p. 126; from Tilling's article.




to all quantitative inquiry. Indeed, in modern scientific literature, 
about 40 percent of published graphics have a relational form, 
with two or more variables (none of which are latitude, longitude, 
or time). This is no accident, since the relational graphic—in its 
barest form, the scatterplot and its variants—is the greatest of all 
graphical designs. It links at least two variables, encouraging and 
even imploring the viewer to assess the possible causal relationship 
between the plotted variables. It confronts causal theories that X 
causes Y with empirical evidence as to the actual relationship be- 
tween X and Y, as in the case of the relationship between lung 
cancer and smoking: 


[Figure: Crude male death rate for lung cancer in 1950 and per capita
consumption of cigarettes in 1930 in various countries. Vertical axis:
deaths per million; horizontal axis: cigarette consumption;
r = 0.73 ± 0.30.]


GRAPHICAL EXCELLENCE 47 


Report of the Advisory Committee to 
the Surgeon General, Smoking and Health 
(Washington, D.C., 1964), p. 176; based 
on R. Doll, “Etiology of Lung Cancer," 
Advances in Cancer Research, 3 (1955), 
1-50.




These small-multiple relational graphs show unemployment and
inflation over time in "Phillips curve" plots for nine countries,
demonstrating the collapse of what was once thought to be an
inverse relationship between the variables.

Paul McCracken, et al., Towards Full Employment and Price
Stability (Paris, 1977), p. 106.


[Figure: Inflation and Unemployment Rates, per cent. Small multiples by
country; surviving panel titles: Canada, United States, Japan, United
Kingdom, France, Germany, Sweden. Axes: increase in CPI against the
unemployment rate.]




Theory and measured observations diverge in the physical sci- 
ences, also. Here the relationship between temperature and the 
thermal conductivity of copper is assessed in a series of measure- 
ments from different laboratories. The connected points are from 
a single publication, cited by an identification number. The very 
different answers reported in the published literature result mainly 
from impurities in the samples of copper. Note how effectively 
the graphic organizes a vast amount of data, recording findings 
of hundreds of studies on a single page and, at the same time, 
enforcing comparisons of the varying results. 


[Figure: Thermal conductivity of copper. Vertical axis: thermal
conductivity, W cm^-1 K^-1; horizontal axis: temperature. The
recommended curve is marked on the plot.]


C. Y. Ho, R. W. Powell, and P. E. Liley, 
Thermal Conductivity of the Elements: A 
Comprehensive Review, supplement no. 
1, Journal of Physical and Chemical 
Reference Data, 3 (1974), 1-244. 




Finally, two relational designs of a different sort, wherein the
data points are themselves data. Here the effect of two variables
interacting is portrayed by the faces on the plotting field:

[Figure: faces arranged on a plotting field with axes labeled FEAR and RAGE.]

E. C. Zeeman, "Catastrophe Theory," Scientific American, 234
(April 1976), 67; based on Konrad Z. Lorenz, King Solomon's Ring
(New York, 1952).


And similarly, the varying sizes of white pine seedlings after 
growing for one season in sand containing different amounts of 
calcium, in parts per million in nutrient-sand cultures: 


H. L. Mitchell, The Growth and Nutrition 
of White Pine Seedlings in Cultures with
Varying Nitrogen, Phosphorus, Potassium 
and Calcium, The Black Rock Forest 
Bulletin No. 9 (Cornwall-on-the- 
Hudson, New York, 1939), p. 70. 




Principles of Graphical Excellence 


Graphical excellence is the well-designed presentation of interesting 
data—a matter of substance, of statistics, and of design. 


Graphical excellence consists of complex ideas communicated 
with clarity, precision, and efficiency. 


Graphical excellence is that which gives to the viewer the greatest 
number of ideas in the shortest time with the least ink in the 
smallest space. 


ideas time 


Graphical excellence is nearly always multivariate. 


And graphical excellence requires telling the truth about the data. 


As to the propriety and justness of representing sums of money, and time, 
by parts of space, tho’ very readily agreed to by most men, yet a few seem 
to apprehend that there may possibly be some deception in it, of which 
they are not aware... . 


William Playfair, The Commercial and Political Atlas (London, 1786) 


People said: “With the chart on the wall, with the figures published, let's 
emulate and rouse our enthusiasm in production.” 


State Statistical Bureau of the People’s Republic of China, 
Statistical Work in the New China (Beijing, 1979) 


Get it right or let it alone. 
The conclusion you jump to may be your own. 


James Thurber, Further Fables for Our Time (New York, 1956) 


2 Graphical Integrity 


For many people the first word that comes to mind when they think 
about statistical charts is "lie." No doubt some graphics do distort 
the underlying data, making it hard for the viewer to learn the 
truth. But data graphics are no different from words in this regard,
for any means of communication can be used to deceive. There 

is no reason to believe that graphics are especially vulnerable to 
exploitation by liars; in fact, most of us have pretty good graphical 
lie detectors that help us see right through frauds. 

Much of twentieth-century thinking about statistical graphics has 
been preoccupied with the question of how some amateurish chart 
might fool a naive viewer. Other important issues, such as the 
use of graphics for serious data analysis, were largely ignored. 

At the core of the preoccupation with deceptive graphics was the 
assumption that data graphics were mainly devices for showing 
the obvious to the ignorant. It is hard to imagine any doctrine
more likely to stifle intellectual progress in a field. The assumption
led down two fruitless paths in the graphically barren years
from 1930 to 1970: First, that graphics had to be "alive,"
"communicatively dynamic," overdecorated and exaggerated (otherwise
all the dullards in the audience would fall asleep in the face
of those boring statistics). Second, that the main task of graphical 
analysis was to detect and denounce deception (the dullards could 
not protect themselves). 

Then, in the late 1960s, John Tukey made statistical graphics 
respectable, putting an end to the view that graphics were only
for decorating a few numbers. For here was a world-class data
analyst spinning off half a dozen new designs and, more importantly,
using them effectively to explore complex data.1 Not a
word about deception; no tortured attempts to construct more 
“graphical standards” in a hopeless effort to end all distortions. 
Instead, graphics were used as instruments for reasoning about 
quantitative information. With this good example, graphical work 
has come to flourish.

Of course false graphics are still with us. Deception must always
be confronted and demolished, even if lie detection is no longer 
at the forefront of research. Graphical excellence begins with telling 
the truth about the data. 


1 John W. Tukey and Martin B. Wilk,
"Data Analysis and Statistics: Techniques
and Approaches," in Edward R. Tufte,
ed., The Quantitative Analysis of Social
Problems (Reading, Mass., 1970), 370-
390; and John W. Tukey, "Some
Graphic and Semigraphic Displays," in
T. A. Bancroft, ed., Statistical Papers in
Honor of George W. Snedecor (Ames,
Iowa, 1972), 293-316.




Here are several graphics that fail to tell the truth. First, the case
of the disappearing baseline in the annual report of a company that
would just as soon forget about 1970. A careful look at the middle
panel reveals a negative income in 1970, which is disguised by
having the bars begin at the bottom at approximately minus
$4,200,000:

Day Mines, Inc., 1974 Annual Report, p. 1.

[Figure: bar charts by year from the annual report; one panel is titled
"Operating Revenues" and labeled $7,382,539.]


This pseudo-decline was created by comparing six months’ 
worth of payments in 1978 to a full year’s worth in 1976 and 1977, 
with the lie repeated four times over.


Commission Payments 
To Travel Agents 


In millions of dollars 


New York Times, August 8, 1978, p. D-1. 


GRAPHICAL INTEGRITY 55 


And sometimes the fact that numbers have a magnitude as well 
as an order is simply forgotten: 


[Figure: Comparative Annual Cost per Capita for Care of Insane in
Pittsburgh City Homes and Pennsylvania State Hospitals. Institutions
labeled include South Mountain, Pittsburgh, and Harrisburg; costs
labeled include $147, $172, $213, and $214.]

Pittsburgh Civic Commission, Report on
Expenditures of the Department of Charities
(Pittsburgh, 1911), p. 7.


What is Distortion in a Data Graphic? 


A graphic does not distort if the visual representation of the data 
is consistent with the numerical representation. What then is the 
"visual representation" of the data? As physically measured on the 
surface of the graphic? Or the perceived visual effect? How do we 
know that the visual image represents the underlying numbers? 

One way to try to answer these questions is to conduct experi- 
ments on the visual perception of graphics—having people look 
at lines of varying length, circles of different areas, and then 
recording their assessments of the numerical quantities. 


I think I see that area B 
is 3.14 times bigger than 
area A. Is that correct? 


Such experiments have discovered very approximate power laws 
relating the numerical measure to the reported perceived measure. 
For example, the perceived area of a circle probably grows some- 
what more slowly than the actual (physical, measured) area: 

$\text{reported perceived area} = (\text{actual area})^{x}$, where $x = .8 \pm .3$,

a discouraging result. Different people see the same areas somewhat 




differently; perceptions change with experience; and perceptions
are context-dependent.2 Particularly disheartening is the securely
established finding that the reported perception of something as
clear and simple as line length depends on the context and what
other people have already said about the lines.3
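
As a rough illustration of the approximate power law quoted above, the following Python sketch shows how widely the reported size ratio for two circles might range. The function name `perceived_area` and the three exponents tried (.8 minus .3, .8, and .8 plus .3) are assumptions made for illustration; they are not taken from any of the experiments cited.

```python
def perceived_area(actual_area: float, exponent: float = 0.8) -> float:
    """Reported perceived area under the approximate power law (actual area)**x."""
    return actual_area ** exponent

# Circle B has 3.14 times the physical area of circle A (A is taken as 1 unit).
actual_ratio = 3.14

# Low, central, and high exponents (assumed range: .8 plus or minus .3).
for x in (0.5, 0.8, 1.1):
    print(f"x = {x}: perceived ratio = {perceived_area(actual_ratio, x):.2f}")
```

With an exponent below one the larger circle is under-read; with an exponent above one it is over-read. The same physical ratio can therefore be reported quite differently by different viewers.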

Misperception and miscommunication are certainly not special
to statistical graphics, but what is a poor designer to do? A
different graphic for each perceiver in each context? Or designs
that correct for the visual transformations of the average
perceiver participating in the average psychological experiment?

One satisfactory answer to these questions is to use a table to show
the numbers. Tables usually outperform graphics in reporting on
small data sets of 20 numbers or less. The special power of graphics
comes in the display of large data sets.

At any rate, given the perceptual difficulties, the best we can
hope for is some uniformity in graphics (if not in the perceivers)
and some assurance that perceivers have a fair chance of getting
the numbers right. Two principles lead toward these goals and, in
consequence, enhance graphical integrity:


The representation of numbers, as physically 
measured on the surface of the graphic itself, 
should be directly proportional to the numerical 
quantities represented. 


Clear, detailed, and thorough labeling should be
used to defeat graphical distortion and ambiguity.
Write out explanations of the data on the
graphic itself. Label important events in the data.


2 The extensive literature is summarized
in Michael Macdonald-Ross, “How 
Numbers Are Shown: A Review of 
Research on the Presentation of Quan- 
titative Data in Texts,” Audio-Visual 
Communication Review, 25 (1977), 359- 
409. In particular, H. J. Meihoefer finds 
great variability among perceivers; see 
“The Utility of the Circle as an Effective 
Cartographic Symbol,” Canadian Car- 
tographer, 6 (1969), 105-117; and “The 
Visual Perception of the Circle in The- 
matic Maps: Experimental Results,” 
ibid., 10 (1973), 63-84. 


3 S. E. Asch, "Studies of Independence
and Submission to Group Pressure. A 
Minority of One Against a Unanimous 
Majority,” Psychological Monographs 
(1956), 70. 


Drawing by CEM; copyright 1961, 
The New Yorker. 




Violations of the first principle constitute one form of graphic
misrepresentation, measured by the

$$\text{Lie Factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}}$$


If the Lie Factor is equal to one, then the graphic might be doing 

a reasonable job of accurately representing the underlying num- 
bers. Lie Factors greater than 1.05 or less than .95 indicate sub- 
stantial distortion, far beyond minor inaccuracies in plotting. 

The logarithm of the Lie Factor can be taken in order to compare 
overstating (log LF > 0) with understating (log LF < 0) errors. In 
practice almost all distortions involve overstating, and Lie Factors of 
two to five are not uncommon. 

Here is an extreme example. A newspaper reported that the 
U.S. Congress and the Department of Transportation had set a 
series of fuel economy standards to be met by automobile manu- 
facturers, beginning with 18 miles per gallon in 1978 and moving 
in steps up to 27.5 by 1985, an increase of 53 percent: 


$$\frac{27.5 - 18.0}{18.0} \times 100 = 53\%$$


These standards and the dates for their attainment were shown: 


This line, representing 18 miles per 
gallon in 1978, is 0.6 inches long. 


Fuel Economy Standards for Autos 


Set by Congress and supplemented by the Transportation 
Department. In miles per gallon. 


This line, representing 27.5 miles per 
gallon in 1985, is 5.3 inches long. 




New York Times, August 9, 1978, p. D-2. 




The magnitude of the change from 1978 to 1985 is shown in the 
graph by the relative lengths of the two lines: 


$$\frac{5.3 - 0.6}{0.6} \times 100 = 783\%$$


Thus the numerical change of 53 percent is presented by some 
lines that changed 783 percent, yielding 


$$\text{Lie Factor} = \frac{783}{53} = 14.8$$


which is too big. 
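
The arithmetic above can be checked with a short Python sketch. The helper names `percent_change` and `lie_factor` are illustrative, and the use of a base-10 logarithm is an assumption, since the text does not specify a base.

```python
import math

def percent_change(first: float, second: float) -> float:
    """Percentage change from the first value to the second."""
    return (second - first) / first * 100.0

def lie_factor(effect_in_graphic: float, effect_in_data: float) -> float:
    """Size of effect shown in graphic divided by size of effect in data."""
    return effect_in_graphic / effect_in_data

data_effect = percent_change(18.0, 27.5)    # miles per gallon, 1978 to 1985
graphic_effect = percent_change(0.6, 5.3)   # line lengths in inches

lf = lie_factor(graphic_effect, data_effect)
print(f"data: {data_effect:.0f}%  graphic: {graphic_effect:.0f}%  Lie Factor: {lf:.1f}")

# A positive logarithm indicates overstating, a negative one understating.
print(f"log10(Lie Factor) = {math.log10(lf):.2f}")
```

Working from the unrounded percentages gives a Lie Factor of about 14.8, in agreement with the figure computed from the rounded 783 and 53.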
The display also has several peculiarities of perspective: 


* On most roads the future is in front of us, toward the horizon, 
and the present is at our feet. This display reverses the conven- 
tion so as to exaggerate the severity of the mileage standards. 


* Oddly enough, the dates on the left remain a constant size on 
the page even as they move along with the road toward the 
horizon. 


* The numbers on the right, as well as the width of the road 
itself, are shrinking because of two simultaneous effects: the 
change in the values portrayed and the change due to perspec- 
tive. Viewers have no chance of separating the two. 


It is easy enough to decorate these data without lying: 


[Figure: Required Fuel Economy Standards: New Cars Built from 1978 to
1985. Standards in miles per gallon by year: 18 (1978), 19, 20, 22, 24,
26, 27, 27.5 (1985). Annotations: 13.7 mpg, average for all cars on
road, 1978; 19.1 mpg, expected average for all cars on road, 1985.]


The non-lying version, in addition, puts the data in a context by 
comparing the new car standards with the mileage achieved by 
the mix of cars actually on the road. Also revealed is a side of the 
data disguised and mispresented in the original display: the fuel 
economy standards require gradual improvement at start-up, 
followed by a doubled rate from 1980 to 1983, and flattening out 
after that. 

Sometimes decoration can help editorialize about the substance 
of the graphic. But it is wrong to distort the data measures—the 
ink locating values of numbers—in order to make an editorial 
comment or fit a decorative scheme. It is also a sure sign of the 
Graphical Hack at work. Here are many decorations but no lies: 


[Figure: Required Fuel Economy Standards: New Cars Built from 1978 to
1985, decorated version. Annotations: 13.7 mpg, average for all cars on
road, 1978; 19.1 mpg, expected average for all cars on road, 1985.]


Design and Data Variation 


Each part of a graphic generates visual expectations about its other 
parts and, in the economy of graphical perception, these expec- 
tations often determine what the eye sees. Deception results from 
the incorrect extrapolation of visual expectations generated at one 
place on the graphic to other places. 

A scale moving in regular intervals, for example, is expected 
to continue its march to the very end in a consistent fashion, with- 
out the muddling or trickery of non-uniform changes. Here an 
irregular scale is used to concoct a pseudo-decline. The first seven 
increments on the horizontal scale are ten years long, masking 
the rightmost interval of four years. Consequently the conspicuous 
feature of the graphic is the apparent fall of curves at the right, 
particularly the decline in prizes won by people from the United 
States (the heavy, dark line) in the most recent period. This effect 
results solely from design variation. It is a big lie, since in reality
(and even in extrapolation, scaling up each end-point by 2.5 to
take the four years' worth of data up to a comparable decade),
the U.S. curve turned sharply upward in the post-1970 interval.
A correction, with the actual data for 1971-80, is at the right:

National Science Foundation, Science Indicators, 1974
(Washington, D.C., 1976), p. 15.


[Figure: two line charts of the number of prizes, vertical axis 0 to 30,
with series labeled for the United States, the United Kingdom, and
Germany. Left: Nobel Prizes Awarded in Science, for Selected Countries,
1901-1974, in ten-year intervals ending with 1971-1974. Right: Nobel
Prizes Awarded in Science, for Selected Countries, 1901-1980, in
ten-year intervals ending with 1971-1980.]


The confounding of design variation with data variation over the 
surface of a graphic leads to ambiguity and deception, for the eye 
may mix up changes in the design with changes in the data. A 
steady canvas makes for a clearer picture. The principle is, then: 


Show data variation, not design variation. 
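
The factor of 2.5 used earlier to bring the four-year interval up to a comparable decade is simply 10/4. A minimal Python sketch of that normalization follows; the function name `per_decade` and the prize counts are hypothetical, chosen only to show how a short final interval can manufacture a decline.

```python
def per_decade(count: float, interval_years: float) -> float:
    """Scale a count observed over interval_years to a ten-year equivalent."""
    return count * (10.0 / interval_years)

# Hypothetical counts, for illustration only (not the actual prize totals).
intervals = [
    ("1961-1970", 10, 20),   # (label, years covered, prizes counted)
    ("1971-1974", 4, 10),
]

for label, years, count in intervals:
    print(f"{label}: raw count {count}, per decade {per_decade(count, years):.1f}")
```

In this made-up case the raw counts fall from 20 to 10, while the rate per decade rises from 20 to 25. Plotting the raw counts on an irregular scale shows the design, not the data.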


Design variation corrupts this display: 


[Figure: "OPEC Oil Prices: After 18 Months of Stability, Prices Are Due
to Rise Again." Dollars per barrel, 1973 to 1978, then quarterly for
1979, with labeled increases on Jan. 1 (5%), April 1 (3.809%),
July 1 (2.284%), and Oct. 1.]

New York Times, December 19, 1978, p. D-7.


Five different vertical scales show the price:


During this time          one vertical inch equals
1973-1978                 $8.00
January-March 1979        $4.73
April-June 1979           $4.37
July-September 1979       $4.16
October-December 1979     $3.92


And two different horizontal scales show the passage of time:


During this time          one horizontal inch equals
1973-1978                 3.8 years
1979                      0.57 years


As the two scales shift simultaneously, the distortion takes on 
multiplicative force. On the left of the graph, a price of $10 for 
one year is represented by 0.31 square inches; on the right side, 
by 4.69 square inches. Thus exactly the same quantity is 4.69/0.31
= 15.1 times larger depending upon where it happens to fall on
the surface of the graphic. That is design variation. 




Design variation infected similar graphics in other publications. 
Here an increase of 454 percent is depicted as an increase of 4,280 
percent, for a Lie Factor of 9.4: 


[Figure: "In the Barrel...": price per bbl. of light crude, leaving
Saudi Arabia on Jan. 1.]


And an increase of 708 percent is shown as 6,700 percent, for a 
Lie Factor of 9.5: 


[Figure: "OPEC Benchmark Prices, 1970-1979," in dollars per barrel, with
a block of explanatory text about benchmark and spot-market prices.
Annotations mark the start of the embargo (Oct. 1973) and the end of the
embargo (March 1974).]


Time, April 9, 1979, p. 57.

Washington Post, March 28, 1979, p. A-18.


All these accounts of oil prices made a second error, by showing
the price of oil in inflated (current) dollars. The 1972 dollar was
worth much more than the 1979 dollar. Thus in sweeping from
left to right over the surface of the graphic, the vertical scale in
effect changes (design variation) because the value of money
changes over the years shown. The only way to think clearly about
money over time is to make comparisons using inflation-adjusted
units of money. Several distinguished graphic designers did express
the price in real dollars, and they also avoided other sources
of design variation. The stars were Business Week, the Sunday
Times (London), and The Economist.


[Figure: The price of crude oil, indexed to 1972 = 100, with the nominal
price labeled.]


In the graphic we saw first, the two sources of design variation 
covered up an intriguing, non-obvious aspect of the data: in the 
four years prior to the 1979-1980 increases, the real price of oil had 
declined. Busy with decoration, the graphic had missed the news. 


[Figure: reduced repeat of "OPEC Oil Prices: After 18 Months of
Stability, Prices Are Due to Rise Again," dollars per barrel.]


[Figure: "Soft touch." OECD area, 1972 = 100. Series: nominal price of
imported oil; real price of imported oil; real price of energy to final
users; ratio of energy use to GDP.]

The Economist, December 29, 1979, p. 41.

Sunday Times (London), December 16, 1979, p. 54.

Business Week, April 9, 1979, p. 99.


[Figure: Playfair's 1786 chart of British government debt. Its legend
reads: The Divisions at the Bottom are Years, & those on the Right hand
Money.]


The Case of Skyrocketing Government Spending 


Probably the most frequently printed graphic, other than the daily 
weather map and stock-market trend line, is the display of gov- 
ernment spending and debt over the years. These arrays nearly 
always create the impression that spending and debt are rapidly 
increasing. 

As usual, Playfair was the first, publishing this finely designed 
graphic in 1786. Accompanied by his polemic against the “ruinous 
folly” of the British government policy of financing its colonial 
wars through debt, it is surely the first skyrocketing government 
debt chart, beginning the now 200-year history of such displays. 
This is one of the few Playfairs that is taller than wide; less than 
one-tenth of all his graphics (about 90, drawn during 35 years of 
work) are longer on the vertical. The tall shape here serves to 
emphasize the picture of rapid growth. The money figures are 
not adjusted for inflation. 

But Playfair had the integrity to show an alternative version a 
few pages later in The Commercial and Political Atlas. The interest on 
the national debt was plotted on a broad horizontal scale, dimin- 
ishing the skyrocket effect. And, furthermore, “This is in real and 
not in nominal millions" (page 129): 


[Figure: Interest of the NATIONAL DEBT from the Revolution. Its legend
reads: The Bottom line is Years, those on the Right hand Millions of
Pounds.]


Although Playfair deflated money units over time in his work of 
1786, the matter has proved difficult for many, eluding even mod- 
ern scholars. This display helps its political point along by failing 
to discount for inflation and population growth and by using a 
tall and thin shape (the area covered by the data is 2.7 times taller 
than wide): 


[Figure A3. The Growth of Government: Federal Spending in Selected
Domestic Areas. Billions of dollars, 1930 to 1970; series labeled
Defense, Income Security, Health, Veterans, and Education.]

Morris Fiorina, Congress: Keystone of the Washington Establishment
(New Haven, 1977), p. 92.


Let us look, in detail, at another graphic on government spending:


[Figure: New York State Total Budget Expenditures and Aid to Localities,
in billions of dollars, fiscal 1966-1976. Chart note: aid to localities
varied from a low of 56.7 percent of the total in 1970-71 to a high of
60.7 percent in 1972-73. The final bars are labeled Estimated and
Recommended.]

New York Times, February 1, 1976, p. IV-6.


Despite the appearance created by the hyperactive design, the state 
budget actually did not increase during the last nine years shown. 
To generate the thoroughly false impression of a substantial and 
continuous increase in spending, the chart deploys several visual 
and statistical tricks—all working in the same direction, to exag- 
gerate the growth in the budget. These graphical gimmicks: 


This cluster of type emphasizes and stretches out the low value for
1966-1967, encouraging the impression that recent years have shot up
from a small, stable base. Horizontal arrows provide similar emphasis.

This squeezed-down block of type contributes to an image of small,
squeezed-down budgets back in the good old days.

These three parallelepipeds have been placed on an optical plane in
front of the other eight, creating the image that the newer budgets
tower over the older ones.

Arrows pointing straight up emphasize recent growth. Compare with
horizontal arrows at left.


Leaving behind the distortion in the chartjunk heap at the left
yields a calmer view:


Two statistical lapses also bias the chart. First, during the years 
shown, the state’s population increased by 1.7 million people, or 
10 percent. Part of the budget growth simply paralleled population 
growth. Second, the period was a time of substantial inflation; 
those goods and services that cost state and local governments 
$1.00 to purchase in 1967 cost $2.03 in 1977. By not deflating, the 
graphic mixes up changes in the value of money with changes in 
the budget. 

Application of arithmetic makes it possible to take population 
and inflation into account. Computing expenditures in constant 
(real) dollars per capita reveals a quite different—and far more 
accurate— picture: 


[Figure: Per capita budget expenditures, in constant dollars, 1967 to
1977. Vertical axis from $300 to $400.]


Thus, in terms of real spending per capita, the state budget in- 
creased by about 20 percent from 1967 to 1970 and remained 
relatively constant from 1970 through 1976. And the 1977 budget 
represents a substantial decline in expenditures. That is the real news 
story of these data, and it was completely missed by the Graph of 
the Magical Parallelepipeds. Of course no small set of numbers is 
going to capture the complexities of a large budget— but, at any 
rate, why tell lies? 


The principle: 


In time-series displays of money, deflated and 
standardized units of monetary measurement are 
nearly always better than nominal units. 
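
A minimal Python sketch of the deflation and standardization described above follows. The price index (1.00 in 1967, 2.03 in 1977) comes from the text, and the population figures are derived from the stated increase of 1.7 million, or 10 percent. The budget totals and the function name `real_per_capita` are hypothetical, introduced only for illustration.

```python
def real_per_capita(nominal_total: float, price_index: float, population: float) -> float:
    """Spending per person in base-year dollars (price_index is 1.0 in the base year)."""
    return nominal_total / price_index / population

# Budget totals below are hypothetical; index and population follow the text.
spend_1967 = real_per_capita(5.0e9, 1.00, 17.0e6)
spend_1977 = real_per_capita(10.0e9, 2.03, 18.7e6)

print(f"1967: ${spend_1967:.0f} per person, in 1967 dollars")
print(f"1977: ${spend_1977:.0f} per person, in 1967 dollars")
print(f"real per capita change: {(spend_1977 / spend_1967 - 1) * 100:.0f}%")
```

In this made-up case a budget that doubles in nominal dollars turns out to be a decline in real spending per person, the kind of reversal that nominal units hide.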


[Figure: reduced repeat of the New York State budget graphic.]


Visual Area and Numerical Measure 


Another way to confuse data variation with design variation is to use 
areas to show one-dimensional data: 


[Figure: "Accroissement de nos exportations d'autos, 1927-1929"
(Growth of our automobile exports, 1927-1929), shown by pictorial
symbols of differing size for Indochine, Maroc, Tunisie, and Algérie.]

R. Satet, Les Graphiques (Paris, 1932), p. 12.


And here is the incredible shrinking doctor, with a Lie Factor of 
2.8, not counting the additional exaggeration from the overlaid 
perspective and the incorrect horizontal spacing of the data: 


[Figure: "The Shrinking Family Doctor in California." Percentage of
doctors devoted solely to family practice, with the number of doctors
and their ratio to population shown for each year.]

Los Angeles Times, August 5, 1979, p. 3.


Many published efforts using areas to show magnitudes make 
the elementary mistake of varying both dimensions simultaneously 
in response to changes in one-dimensional data. Typical is the 
shrinking dollar fallacy. To depict the rate of inflation, graphs 
show currency shrinking on two dimensions, even though the 
value of money is one-dimensional. Here is one of hundreds of 
such charts: 


Washington Post, October 25, 1978, p. 1. 


[Figure: a series of shrinking dollar bills; the last is labeled
1978 (August), Carter: 44¢. Source: Labor Department.]


If the area of the dollar is accurately to reflect its purchasing power, 
then the 1978 dollar should be about twice as big as that shown. 
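
The "about twice as big" claim can be checked with a little arithmetic, sketched here in Python. The value of 44 cents for the 1978 dollar and the supposition that the chart shrinks both width and height in proportion to that value are assumptions made for this illustration; the function name is hypothetical.

```python
def area_when_both_dimensions_scale(value_ratio: float) -> float:
    """Relative area if width and height are each multiplied by value_ratio."""
    return value_ratio ** 2

purchasing_power = 0.44   # assumed: the later dollar buys 44 cents' worth

drawn_area = area_when_both_dimensions_scale(purchasing_power)
honest_area = purchasing_power   # area proportional to the one-dimensional value

print(f"drawn area: {drawn_area:.2f} of the original bill")
print(f"honest area: {honest_area:.2f} of the original bill")
print(f"honest area is {honest_area / drawn_area:.1f} times the drawn area")
```

Under these assumptions the drawn bill covers roughly a fifth of the original area where it should cover 44 percent, so an area-faithful bill would be a bit more than twice as large.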




There are considerable ambiguities in how people perceive a two-
dimensional surface and then convert that perception into a one-
dimensional number. Changes in physical area on the surface of a
graphic do not reliably produce appropriately proportional changes
in perceived areas. The problem is all the worse when the areas
are tricked up into three dimensions:


[Figure: "In the Barrel...": price per bbl. of light crude, leaving
Saudi Arabia.]


By surface area, the Lie Factor for this graphic is 9.4. But, if one 
takes the barrel metaphor seriously and assumes that the volume of 
the barrels represents the price change, then the volume from 1973 
to 1979 increases 27,000 percent compared to a data increase of 
454 percent, for a Lie Factor of 59.4, which is a record. 
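
The two Lie Factors for the barrel graphic can be recomputed from the rounded percentages quoted in the text (a data increase of 454 percent, shown as 4,280 percent by surface area and 27,000 percent by volume). This Python sketch uses a hypothetical helper name; small differences from the printed figures would come from that rounding.

```python
def lie_factor(percent_shown: float, percent_in_data: float) -> float:
    """Size of effect shown in graphic divided by size of effect in data."""
    return percent_shown / percent_in_data

data_increase = 454.0   # percent increase in the price itself

by_surface_area = lie_factor(4280.0, data_increase)
by_volume = lie_factor(27000.0, data_increase)

print(f"Lie Factor by surface area: about {by_surface_area:.1f}")
print(f"Lie Factor by volume: about {by_volume:.0f}")
```

Each added dimension compounds the exaggeration: the same data change reads as roughly nine times too large by area and roughly sixty times too large by volume.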

Similarly, a three-dimensional representation puffing up 
one-dimensional data: 


New York Times, January 27, 1981, 
p. D-1. 


Conclusion: The use of two (or three) varying dimensions to 
show one-dimensional data is a weak and inefficient technique, 
capable of handling only very small data sets, often with error in 
design and ambiguity in perception. These designs cause so many 
problems that they should be avoided: 


The number of information-carrying (variable) 
dimensions depicted should not exceed the 
number of dimensions in the data. 


[Figure: “Casse Postali di Risparmio Italiane: Numero dei Libretti, Libretto medio e Deposito totale al fine di ogni mese” (Italian postal savings banks: number of savings books, average book, and total deposits at the end of each month).]


This multivariate history of the Italian post office uses two di-
mensions in a way nearly consistent with this principle, with the
number of postal savings books issued and the average size of
deposits multiplying up to total deposits at the end of each month
from 1876 to 1881.

Antonio Gabaglio, Teoria Generale della Statistica (Milan, second
edition, 1888).




But Playfair’s circles, an early use of area to show magnitude,
are not consistent with the principle, since the one-dimensional
data (city populations) are represented by an areal data measure:


Perhaps graphics that border on cartoons should be exempt
from the principle. We certainly would not want to forgo the
4,340 pound chicken:

Scientific American Reference Book (New York, 1909), p. 280.




Context is Essential for Graphical Integrity 


To be truthful and revealing, data graphics must bear on the ques- 
tion at the heart of quantitative thinking: "Compared to what?" 
The emaciated, data-thin design should always provoke suspicion, 
for graphics often lie by omission, leaving out data sufficient for 
comparisons. The principle: 


Graphics must not quote data out of context. 


Nearly all the important questions are left unanswered by this
display:


[Figure: Connecticut Traffic Deaths, Before (1955) and After (1956) Stricter Enforcement by the Police Against Cars Exceeding Speed Limit. Two points, labeled “Before stricter enforcement” (1955) and “After stricter enforcement” (1956).]


A few more data points add immensely to the account:


[Figure: Connecticut Traffic Deaths, 1951-1959.]


Imagine the very different interpretations other possible time- 
paths surrounding the 1955-1956 change would have: 




Comparisons with adjacent states give a still better context, reveal- 
ing it was not only Connecticut that enjoyed a decline in traffic 
fatalities in the year of the crackdown on speeding: 


[Figure: Traffic Deaths per 100,000 Persons in Connecticut, Massachusetts, Rhode Island, and New York, 1951-1959. Lines labeled New York, Massachusetts, Connecticut, and Rhode Island.]


Donald T. Campbell and H. Laurence Ross, “The Connecticut
Crackdown on Speeding: Time Series Data in Quasi-Experimental
Analysis,” in Edward R. Tufte, ed., The Quantitative Analysis of
Social Problems (Reading, Mass., 1970), 110-125.


Conclusion 


Lying graphics cheapen the graphical art everywhere. Since the 
lies often show up in news reports, millions of images are printed. 
When a chart on television lies, it lies tens of millions of times 
over; when a New York Times chart lies, it lies 900,000 times over 
to a great many important and influential readers. The lies are told 
about the major issues of public policy —the government budget, 
medical care, prices, and fuel economy standards, for example. 
The lies are systematic and quite predictable, nearly always exag- 
gerating the rate of recent change. 

The main defense of the lying graphic is . . . “Well, at least it
was approximately correct, we were just trying to show the gen- 
eral direction of change." But many of the deceptive displays we 
saw in this chapter involved fifteenfold lies, too large to be de- 
scribed as approximately correct. And in several cases the graphics 
were not even approximately correct by the most lax of standards, 
since they falsified the real news in the data. It is the special char- 
acter of numbers that they have a magnitude as well as an order; 
numbers measure quantity. Graphics can display the quantitative 
size of changes as well as their direction. The standard of getting 
only the direction and not the magnitude right is the philosophy 
that informs the Pravda School of Ordinal Graphics. There, every 
chart has a crystal clear direction coupled with fantasy magnitudes. 


[Figure: “Рост продукции промышленности (1922 г. = 1)” (“Growth of industrial output, 1922 = 1”).]


Pravda, May 24, 1982, p. 2. 


A second defense of the lying graphic is that, although the de- 
sign itself lies, the actual numbers are printed on the graphic for 
those picky folks who want to know the correct size of the effects 
displayed. It is as if not lying in one place justified fifteenfold lies 
elsewhere. Few writers would work under such a modest standard 
of integrity, and graphic designers should not either. 


Graphical integrity is more likely to result if these six principles 
are followed: 


The representation of numbers, as physically measured on the 
surface of the graphic itself, should be directly proportional to 
the numerical quantities represented. 


Clear, detailed, and thorough labeling should be used to defeat 
graphical distortion and ambiguity. Write out explanations of 
the data on the graphic itself. Label important events in the data. 


Show data variation, not design variation. 


In time-series displays of money, deflated and standardized units 
of monetary measurement are nearly always better than nominal 
units. 


The number of information-carrying (variable) dimensions 
depicted should not exceed the number of dimensions in the 
data. 


Graphics must not quote data out of context. 




3 Sources of Graphical Integrity and Sophistication 


Why do artists draw graphics that lie? Why do the world’s major
newspapers and magazines publish them?¹

Although bias and stereotyping are the origin of more than a
few graphical distortions, the primary causes of inept graphical
work are to be found in the skills, attitudes, and organizational
structure prevailing among those who design and edit statistical
graphics.


Lack of Quantitative Skills of Professional Artists 


Lurking behind the inept graphic is a lack of judgment about quan- 
titative evidence. Nearly all those who produce graphics for mass 
publication are trained exclusively in the fine arts and have had 
little experience with the analysis of data. Such experience is essen- 
tial for achieving precision and grace in the presence of statistics, 
but even textbooks of graphical design are silent on how to think 
about numbers. Illustrators too often see their work as an exclu- 
sively artistic enterprise—the words "creative," "concept," and 
"style" combine regularly in all possible permutations, a Big Think 
jargon for the small task of constructing a time-series a few data 
points long. Those who get ahead are those who beautify data, 
never mind statistical integrity. 


The Doctrine That Statistical Data Are Boring 


Inept graphics also flourish because many graphic artists believe 
that statistics are boring and tedious. It then follows that decorated 
graphics must pep up, animate, and all too often exaggerate what 
evidence there is in the data. For example: 


* Time’s first full-time chart specialist, an art-school graduate, 
says that in his work, “The challenge is to present statistics as 
a visual idea rather than a tedious parade of numbers.”²


* The opening sentence of the chapter on statistical charts in Jan
White's Graphic Idea Notebook: “Why are statistics so boring?”
Sample illustrations supposedly reveal “Dry statistics turned
into symbolic graphics” and “Plain statistics embellished or
humanized with pictures.”³


* A fine book on graphics, Herdeg’s Graphis/Diagrams, is described
by its publisher: “An international review demonstrating
convincingly that statistical and diagrammatic graphics do not
necessarily have to be dull.”⁴


1 “It is difficult to know why these same errors are being repeated.
In Playfair’s original work these kinds of mistakes were not made;
moreover, these errors were not as widespread in the 1930s as
they are now. Perhaps the reason is an increase in the perceived
need for graphs . . . without a concomitant increase in training in
their construction. Evidence gathered by the committee on graphics
of the American Statistical Association indicates that formal
training in graphic presentation has had a marked decline at all
levels of education over the last few decades.” Howard Wainer,
“Making Newspaper Graphs Fit to Print,” in Paul A. Kolers, et al.,
eds., Processing of Visible Language 2 (New York, 1980), p. 139.

2 Time, February 11, 1980, p. 3.

3 Jan V. White, Graphic Idea Notebook (New York, 1980), pp. 148, 165.

4 Walter Herdeg, ed., Graphis/Diagrams (Zurich, 1976).


The doctrine of boring data serves political ends, helping to 
advance certain interests over others in bureaucratic struggles for 
control of a publication's resources. For if the numbers are dull 
dull dull, then an artist, indeed many artists, indeed an Art De- 
partment and an Art Director are required to animate the data, 
lest the eyes of the audience glaze over. Thus the doctrine encour- 
ages placing data graphics under control of artists rather than in 
the hands of those who write the words and know the substance. 
As the art bureaucracy grows, style replaces content. And the word 
people, having lost space in the publication to data decorators, 
console themselves with thoughts that statistics are really rather 
tedious anyway. 

If the statistics are boring, then you've got the wrong numbers. 
Finding the right numbers requires as much specialized skill— 
statistical skill —and hard work as creating a beautiful design or 
covering a complex news story. 


The Doctrine That Graphics Are Only for the 
Unsophisticated Reader 


Many believe that graphical displays should divert and entertain 
those in the audience who find the words in the text too difficult. 
For example: 


* Consumer Reports describes the design of their new consumer
magazine for children: “For the first test issue, CU's professional
staff produced an article about sugar that was longer on
graphics than on information. We had feared children might be
overwhelmed by too many facts.”⁵


* An art director with overall responsibility for the design of
some 3,000 data graphics each year (yielding 2.5 billion printed
images) said that graphics are intended more to lure the reader’s
attention away from the advertising than to explain the news
in any detail. “Unlike the advertisements,” he said, “at least
we don't put naked women in our graphics.”⁶


* A news director at a national television network said that
graphics must be instantly understandable: “If you have to
explain it, don’t use it.”⁷


5 Consumer Reports, 45 (July 1980), 408.

6 Louis Silverstein, “Graphics at the New York Times,” presentation
at the First General Conference on Social Graphics, Leesburg,
Virginia, October 23, 1978.

7 Interview with author, July 1980.


This kind of graphical thinking leads to


[Figure: “The Company Cafeteria was used by 9 Out of 10 Employees during the Fiscal Year 1949.” Source: Company Reports.]


Mary Eleanor Spear, Charting Statistics (New York, 1952), p. 5,
who appropriately describes this as an “unnecessary chart.”


The Consequences 


What E. B. White said of writing is also true of statistical graphics:
“No one can write decently who is distrustful of the reader's
intelligence, or whose attitude is patronizing.”⁸ Contempt for
graphics and their audience, along with the lack of quantitative
skills among illustrators, has deadly consequences for graphical
work: over-decorated and simplistic designs, tiny data sets, and
big lies.

8 In William Strunk, Jr., and E. B. White, The Elements of Style
(New York, 1959).

Like censorship, these constraints on graphical design lead to
elliptical and eccentric communication. In seeking to avoid
the subtleties of the scatterplot, artists drew up these convoluted
specimens, forcing bivariate data into a univariate design:


[Figure: “Views on the Economy Influence Carter Support”: percentage of Democratic voters saying that their financial situation is worse than a year ago, and percentage of Democratic primary vote won by Carter, by state.]

New York Times, June 16, 1980, p. A-18.

Allen D. Manvel, “Taxation and Economic Growth,” Taxation with
Representation Newsletter, 9 (June 1980), p. 3.


But beyond reviewing a few examples, let us look more sys-
tematically at the level of graphical sophistication prevailing at
different publications. In order to make comparisons among a
variety of newspapers, magazines, scientific journals, and books,
I have compiled a rough measure of graphical sophistication—the
share of a publication’s graphics that are relational. Such a design
links two or more variables but is not a time-series or a map.
Relational graphics are essential to competent statistical analysis
since they confront statements about cause and effect with evi-
dence, showing how one variable affects another. The design idea
is a simple one, although not quite as simple as the bar chart, time-
series plot, or data map. Relational graphics have been used since
1765 and are printed billions of times and ways every year; and
there is evidence that twelve-year-old children understand the
design.⁹

All these graphics count as sophisticated by our hardly
demanding measure:¹⁰



The frequency of use of relational designs was counted for 
randomly selected issues from 1974 to 1980 of each of 15 news 
publications. A total of about 4,000 graphics were examined in 
sampled issues. Scaling up the observed data by the frequency and 
circulation of the publication indicates that the sample represents a 
population of 250 to 300 billion printed graphical images. 


9 Clara Francis Bamberger, “Interpretation of Graphs at the
Elementary School Level,” Catholic University of America
Educational Research Monographs, 13 (May 1942). Additional data
from textbooks and standardized tests are presented shortly.


10 A variety of measures of graphical intelligence and complexity
are possible and another, data density, is discussed in Chapter 8.


[Figure: scatterplot against Population (logarithmic axis, 1,000 to 1,000,000) of towns and cities in Czechoslovakia, France, Germany, Greece, Israel, and the United States, among them Prague, Munich, Athens, Brno, Netanya, Jerusalem, New Haven, Iraklion, Psychro, and Corte. The New York Times/Feb. 29, 1976.]


Pace of City Life Found 
2.8 Feet per Second Faster 


By BOYCE RENSBERGER 

The pace of life in big 
cities is faster than it is in 
small towns—about 2.8 feet 
per second faster, according 
to a study by a Princeton 
University psychologist and 
his wife, who is an anthro- 
pologist. 

By measuring how fast 
people walk along the main 
streets of municipalities of 
varying sizes, they have con- 
firmed what most people 
have sensed informally. The 
bigger the city, the faster its 
inhabitants walk. 

They found, for example, 
that on Flatbush Avenue in 
Brooklyn, people walk at a
brisk 5 feet per second, only 
a little slower than their 
counterparts on Wenceslas 
Square in Prague, who bustle 
along at 5.8 feet per second. 

In contrast to Brooklyn and
Prague, both of which have
a population of more than a
million, the 365 citizens of
Psychro, Greece, amble along
at 2.7 feet per second and the
people of Corte, France
(population 5,500), move at 3.3
feet per second.


New York Times, February 29, 1976,
p. 46.


Isao Sato and Miyohei Shinohara, New 
Politics and Economics (Tokyo, 1974), p. 
113; a Japanese high school textbook. 




Table 1 shows the results, ranking the 15 news publications by 
graphical sophistication. Seven of the papers, from Pravda to the 
Wall Street Journal, produced no relational graphics among those 
sampled and usually limited themselves to time-series. Other papers 
published more advanced graphics: the Japanese Asahi (a mass cir- 
culation daily), Akahata (“Red Flag," a Communist party paper 
that appears, from the data, to have employed a sophisticated and 
talented graphical designer in 1979), and Nihon Keizai (a financial 
daily), as well as Der Spiegel and The Economist. Although none 
reached the level of sophistication found in displays of scientific 
data (a random sample of 220 graphics from Science 1978-1980 had 
42 percent of relational design), it is clear that some graphical 
intelligence is possible in news work, at least in Japan and at a few 
European weeklies. 


Table 1
Graphical Sophistication, World Press, 1974-1980


                                              Percentage of statistical
                                              graphics based on more       Number of
                                              than one variable, but not   graphics in
                                              a time-series or a map       sample

Akahata (“Red Flag”) (Japan, daily,           9.3%                         202
  circulation 30,000)
Asahi Shimbun (Japan, daily, 8,000,000)       7.6%                         119
Der Spiegel (Germany, weekly, 1,000,000)      5.7%                         454
The Economist (Britain, weekly, 170,000)      2.0%                         342
Nihon Keizai Shimbun (Japan, daily            1.7%                         297
  financial paper, 1,700,000)
Le Monde (French, daily, 440,000)             0.7%                         144
Business Week (U.S., weekly, 800,000)         0.6%                         726
New York Times (U.S., daily, 900,000;         0.5%                         422
  Sunday, 1,500,000)
Pravda (USSR, daily, 10,500,000)              0.0%                         54
Frankfurter Allgemeine (Germany, daily,       0.0%                         93
  300,000)
The Times (Britain, daily, 400,000)           0.0%                         107
Washington Post (U.S., daily, 600,000;        0.0%                         127
  Sunday, 800,000)
Time (U.S., weekly, 4,300,000)                0.0%                         147
Die Zeit (Germany, weekly, 300,000)           0.0%                         213
Wall Street Journal (U.S., daily, 2,000,000)  0.0%                         449
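As a cross-check on the text, the rows of Table 1 can be tallied. This is only a sketch: the tuple layout and variable names are illustrative, and the count of relational graphics is an estimate recovered from the rounded percentages, not a figure reported in the source.

```python
# (publication, percent relational, graphics in sample), copied from Table 1.
TABLE_1 = [
    ("Akahata", 9.3, 202),
    ("Asahi Shimbun", 7.6, 119),
    ("Der Spiegel", 5.7, 454),
    ("The Economist", 2.0, 342),
    ("Nihon Keizai Shimbun", 1.7, 297),
    ("Le Monde", 0.7, 144),
    ("Business Week", 0.6, 726),
    ("New York Times", 0.5, 422),
    ("Pravda", 0.0, 54),
    ("Frankfurter Allgemeine", 0.0, 93),
    ("The Times", 0.0, 107),
    ("Washington Post", 0.0, 127),
    ("Time", 0.0, 147),
    ("Die Zeit", 0.0, 213),
    ("Wall Street Journal", 0.0, 449),
]

total_graphics = sum(n for _, _, n in TABLE_1)
no_relational = [name for name, pct, _ in TABLE_1 if pct == 0.0]
# Approximate only: the table reports rounded percentages, not counts.
estimated_relational = sum(round(pct / 100 * n) for _, pct, n in TABLE_1)

print(len(TABLE_1))          # 15
print(total_graphics)        # 3896
print(len(no_relational))    # 7
print(estimated_relational)  # 73
```

The totals agree with the surrounding text: 15 publications, about 4,000 graphics, and seven papers with no relational graphics among those sampled.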


Japanese graphical distinction is consistent with that country’s
heavy use of statistical techniques in the workplace and extensive
quantitative training, even in the early years of school:


. . . no nation ranks higher in its collective passion for statistics.
In Japan, statistics are the subject of a holiday, local and national
conventions, awards ceremonies and nationwide statistical
collection and graph-drawing contests. “This year,” said
Yoshiharu Takahashi, a Government statistician, “we had
almost 30,000 entries. Actually, we had 29,836.”

Entries in the [children’s] statistical graph contest were
screened three times by judges, who gave first prize this year
to the work of five 7-year-olds. Their graph creation, titled
“Mom, play with us more often,” was the result of a survey
of 32 classmates on the frequency that mothers play with their
offspring and the reasons given for not doing so. . . . Other
children's work examined the frequency of family phone usage
and correlated the day's temperature with cicada singing.¹¹


11 Andrew H. Malcolm, “Data-Loving Japanese Rejoice on Statistics
Day,” New York Times, October 28, 1977, p. A-1.


Note the relational design of the last children's graphic mentioned. 


The five U.S. publications examined rank toward the bottom 
of the world list, along with Pravda and a few European papers. 
Note, in Table 1, the complete dominance of non-relational 
designs at the lower-ranked newspapers and magazines. This is 
unfortunate because the relational graphic, unlike the simpler 
designs, is an explanatory graphic— surely a natural for news reporting 
and analysis. 

The statistical graphics found in college and even high school 
textbooks are more sophisticated than those in news publications. 
Indeed, grade school children may experience a greater density of 
relational graphics than someone who reads only Business Week, 
the New York Times, Time, the Wall Street Journal, and the Wash- 
ington Post. Tables 2 and 3 record the graphical sophistication of 
textbooks and of a variety of standardized educational tests. 

A comparison between these data and Table 1 suggests that most
news publications outside of Japan operate at a pre-adult level of
intelligence in graphical design.¹²


12 Readers of news publications, particularly the elite press, have
considerable educational and professional attainments, with the
resulting graphical skills. About 80 percent of the 1.5 million
readers of the Sunday New York Times attended college, according
to a 1980 Times market survey. The audience for statistical
graphics is smarter than many illustrators believe.


Table 2
Graphical Sophistication, College and High School Textbooks


                                              Percentage of statistical
                                              graphics based on more
                                              than one variable, but not   Number of
                                              a time-series or a map       graphics

COLLEGE TEXTBOOKS:

Medicine and public health: 11 articles in     82%                         17
  Judith Tanur, et al., Statistics: A Guide
  to the Unknown
Introduction to Psychology, by Ernest          68%                         82
  Hilgard, et al.
General Chemistry, by Linus Pauling            66%                         53
Life on Earth, by Edward Wilson, et al.        47%                         59
Weather, astronomy, engineering: 7             44%                         9
  articles in Tanur, Statistics: A Guide
  to the Unknown
Communication, work, education,                43%                         35
  economics: 20 articles in Tanur,
  Statistics: A Guide to the Unknown
Political Behavior of the American             42%                         43
  Electorate, by William H. Flanigan
  and Nancy H. Zingale
Economics, by Paul Samuelson                   16%                         57
Democracy in America, by Robert A. Dahl         8%                         25
American Government, by James Q. Wilson         0%                         39

HIGH SCHOOL TEXTBOOKS:

Chemical Principles, by William Masterton      77%                         27
  and Emil Slowinski
The Project Physics Course, by Harvard         48%                         33
  Project Physics
New Politics and Economics, by Isao Sato       27%                         22
  and Miyohei Shinohara (Japanese)
Biological Science: An Ecological Approach,    18%                         28
  Biological Sciences Curriculum Study
The American Economy, by Roy J.                 5%                         132
  Sampson, et al.
Sociology: The Study of Human                   0%                         3
  Relationships, by LaVerne Thomas
  and Robert Anderson
New Ethics and Social Science, by               0%                         5
  Yokichi Yajima, et al. (Japanese)
Rise of the American Nation, by                 0%                         39
  Lewis Paul Todd and Merle Curti
Magruder’s American Government, revised         0%                         70
  by William McClenaghan


Table 3
Graphical Sophistication, Educational Tests


                                              Percentage of statistical
                                              graphics based on more
                                              than one variable, but not   Number of
                                              a time-series or a map       graphics

National university entrance examinations,    100%                         16
  Japan, 1979 and 1980
Review materials, Law School Admission         48%                         29
  Test, United States 1975
Standardized tests for grade school, high
school, and college; United States, 1970s:*
  Science, 14 tests                            67%                         64
  Arithmetic, mathematics, algebra, and        41%                         37
    analytic geometry; 21 tests
  Social studies, history, and                 24%                         49
    government; 14 tests
  General ability, 5 tests                     21%                         47


* Graphics collected in James R. Beniger, compiler, Selected Standardized Test Items
that Measure Ability with Graphics (Washington, D.C.: Bureau of Social Science
Research, 1975).


And so, just as there is a double standard of integrity at a good 
many news publications— one for words, another for graphics— 
so there is a double standard of sophistication. The statistical 
graphics are stupid; the prose is often serious and sometimes even 
demanding of expertise, as can be seen in these sentences from a 
single issue of the New York Times: 


Recycling petrodollars may postpone the day of reckoning, but its effects would 
soon become intolerable without a steady depreciation in their purchasing 
power. Floating rates of exchange cannot restore even a semblance of 
equilibrium. 


Numerous facets of the performance seem decidedly unfashionable if not 
downright eccentric: the square-toed instrumental phrasing and the frequent 
plodding tempos in the arias, the Romanticized treatment of the chorales, the 
generous retards at every cadence, the often intrusively elaborate continuo 
improvisations and an inconsistent attitude toward expression which ranges 
from heaving Mahlerian emphases to mechanical literalism. 


The Court shows no sign of retreating from its view that a state government 
is protected by sovereign immunity against court orders to pay retroactive 
damages for past violations. 


And Dr. Garth Graham, a medical director with Smithkline Corp., makers of
Thorazine, noted that neuroleptics produce no euphoria, and are therefore 
unlikely to be abused by patients with a history of drug or alcohol dependence. 
“They are, if anything, dysphorogenic,” Dr. Graham said. 




Conclusion 


The conditions under which many data graphics are produced— 
the lack of substantive and quantitative skills of the illustrators, 
dislike of quantitative evidence, and contempt for the intelligence 
of the audience—guarantee graphic mediocrity. These conditions 
engender graphics that (1) lie; (2) employ only the simplest designs,
often unstandardized time-series based on a small handful of
data points; and (3) miss the real news actually in the data.

It wastes the tremendous communicative power of graphics to 
use them merely to decorate a few numbers. Moreover, much of 
the world these days is observed and assessed quantitatively —and 
well-designed graphics are far more effective than words in 
showing such observations. 

How can graphic mediocrity be remedied? 

Surely there is something to be said for rejecting once and for 
all the doctrines that data graphics are for the unintelligent and 
that statistics are boring. These doctrines blame the victims (the 
audience and the data) rather than the perpetrators. 

Graphical competence demands three quite different skills: the 
substantive, statistical, and artistic. Yet now most graphical work, 
particularly at news publications, is under the direction of but a 
single expertise—the artistic. Allowing artist-illustrators to control 
the design and content of statistical graphics is almost like allowing 
typographers to control the content, style, and editing of prose. 
Substantive and quantitative expertise must also participate in the 
design of data graphics, at least if statistical integrity and graphical 
sophistication are to be achieved. 




PART II 


Theory of Data Graphics 


Everyone spoke of an information overload, but what there was in fact 
was a non-information overload. 


Richard Saul Wurman, What-If, Could-Be (Philadelphia, 1976) 


4  Data-Ink and Graphical Redesign 


Data graphics should draw the viewer's attention to the sense and 
substance of the data, not to something else. The data graphical 
form should present the quantitative contents. Occasionally artful- 
ness of design makes a graphic worthy of the Museum of Modern 
Art, but essentially statistical graphics are instruments to help 
people reason about quantitative information. 

Playfair's very first charts devoted too much of their ink to 
graphical apparatus, with elaborate grid lines and detailed labels. This 
time-series, engraved in August 1785, is from the early pages of 
The Commercial and Political Atlas: 


[Figure: “Chart of Imports and Exports of England to and from all North America, From the Year 1770 to 1782,” by W. Playfair. The bottom line is divided into years, the right-hand line into hundred thousand pounds.]


Within a year Playfair had eliminated much of the non-data 
detail in favor of cleaner design that focused attention on the 
time-series itself. He then began working with a new engraver 
and was soon producing clear and elegant displays: 


[Figure: “Exports and Imports to and from Denmark & Norway from 1700 to 1780.” The bottom line is divided into years, the right-hand line into L10,000 each. Published 1786 by W. Playfair, Strand, London.]


This improvement in graphical design illustrates the fundamental 
principle of good statistical graphics: 


Above all else show the data. 


The principle is the basis for a theory of data graphics. 


Data-Ink 


A large share of ink on a graphic should present data-information,
the ink changing as the data change. Data-ink is the non-erasable
core of a graphic, the non-redundant ink arranged in response
to variation in the numbers represented. Then,


\[
\begin{aligned}
\text{Data-ink ratio} &= \frac{\text{data-ink}}{\text{total ink used to print the graphic}} \\
&= \text{proportion of a graphic's ink devoted to the non-redundant display of data-information} \\
&= 1.0 - \text{proportion of a graphic that can be erased without loss of data-information.}
\end{aligned}
\]
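The definition translates directly into a small calculation. The sketch below assumes that ink can be measured in some common unit (printed pixels, say); that unit and the function names are hypothetical and serve only to show that the ratio and the erasable share sum to one.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of a graphic's ink devoted to non-redundant data-information."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


def erasable_share(data_ink: float, total_ink: float) -> float:
    """Share of a graphic that can be erased without loss of data-information."""
    return 1.0 - data_ink_ratio(data_ink, total_ink)


# Hypothetical measurement: 700 of 1,000 inked pixels carry data.
print(round(data_ink_ratio(700, 1000), 2))  # 0.7
print(round(erasable_share(700, 1000), 2))  # 0.3

# A grid with no plotted points has a null data-ink ratio.
print(data_ink_ratio(0, 1000))  # 0.0
```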


A few graphics use every drop of their ink to convey measured 
quantities. Nothing can be erased without losing information in 
these continuous eight tracks of an electroencephalogram. The data 
change from background activity to a series of polyspike bursts. 
Note the scale in the bottom block, lower right: 


[Figure: eight continuous tracks of an electroencephalogram.]


Kenneth A. Kooi, Fundamentals of Electroencephalography (New York,
1971), p. 110.


Most of the ink in this graphic is data-ink (the dots and labels 
on the diagonal), with perhaps 10-20 percent non-data-ink 
(the grid ticks and the frame): 


[Figure: scatterplot of organisms, from E. coli and Pseudomonas up to the whale and the sequoia. Caption: “The length of an organism at the time of reproduction in relation to the generation time, plotted on a logarithmic scale.” Horizontal axis: generation time, 1 hour to 100 years. Vertical axis: length, 1 μ to 100 m.]


John Tyler Bonner, Size and Cycle: An Essay on the Structure of
Biology (Princeton, 1965), p. 17.


In this display with nearly all its ink devoted to matters other 
than data, the grid sea overwhelms the numbers (the faint points 
scattered about the diagonal): 


[Figure: Relationship of Actual Rates of Registration to Predicted Rates (104 cities 1960).]


Another published version of the same data drove the share of 
data-ink up to about 0.7, an improvement: 


[Figure: Relationship of Actual Rates of Registration to Predicted Rates (104 cities 1960).]


But a third reprint publication of the same figure forgot to plot
the points and simply retraced the grid lines from the original,
including the excess strip of grid along the top and right margins.
The resulting figure achieves a graphical absolute zero, a null
data-ink ratio:


[Figure 19.1: Relationship of Actual Rates of Registration to
Predicted Rates (104 cities, 1960). Axes labeled Actual and Predicted;
grid lines only, no data points.]




The three graphics were published in, respectively, Stanley Kelley,
Jr., Richard E. Ayres, and William G. Bowen, “Registration and Voting:
Putting First Things First,” American Political Science Review, 61
(1967), 371; then reprinted in Edward R. Tufte, ed., The Quantitative
Analysis of Social Problems (Reading, Mass., 1970), p. 267; and
reprinted again in William J. Crotty, ed., Public Opinion and
Politics: A Reader (New York, 1970), p. 364.




Maximizing the Share of Data-ink 


The larger the share of a graphic’s ink devoted to data, the better
(other relevant matters being equal):


Maximize the data-ink ratio, within reason. 


Every bit of ink on a graphic requires a reason. And nearly always 
that reason should be that the ink presents new information. 

The principle has a great many consequences for graphical editing 
and design. The principle makes good sense and generates reason- 
able graphical advice—for perhaps two-thirds of all statistical 
graphics. For the others, the ratio is ill-defined or is just not appro- 
priate. Most important, however, is that other principles bearing on 
graphical design follow from the idea of maximizing the share of 
data-ink. 
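
As a rough illustration of the quantity being maximized, the share of data-ink can be written as a small calculation. The sketch below assumes that ink can be measured in some consistent unit (printed area, say); the function names and the sample quantities are hypothetical and are not measurements of any graphic shown here.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that is devoted to data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


def ratio_from_erasable(erasable_share):
    """Same ratio, stated as 1.0 minus the share of ink that could be
    erased without loss of data-information."""
    if not 0.0 <= erasable_share <= 1.0:
        raise ValueError("erasable_share must lie between 0 and 1")
    return 1.0 - erasable_share


# Hypothetical measurements, in arbitrary units of ink.
ratio = data_ink_ratio(data_ink=7.0, total_ink=10.0)

# A grid with no plotted points carries no data-ink at all.
null_ratio = data_ink_ratio(data_ink=0.0, total_ink=10.0)
```

The two functions express one idea from two directions: counting the ink that presents data, or counting the ink that could go.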


Two Erasing Principles 


The other side of increasing the proportion of data-ink is an 
erasing principle: 


Erase non-data-ink, within reason. 


Ink that fails to depict statistical information does not have much
interest to the viewer of a graphic; in fact, sometimes such
non-data-ink clutters up the data, as in the case of a thick mesh of
grid lines. While it is true that this boring ink sometimes helps set
the stage for the data action, it is surprising, as we shall see in
Chapter 7, how often the data themselves can serve as their own stage.

Redundant data-ink depicts the same number over and over. The
labeled, shaded bar of the bar chart, for example, unambiguously
locates the altitude in six separate ways (any five of the six can be
erased and the sixth will still indicate the height): as the
(1) height of the left line, (2) height of shading, (3) height of
right line, (4) position of top horizontal line, (5) position (not
content) of number at bar's top, and (6) the number itself. That is
more ways than are needed. Gratuitous decoration and reinforcement
of the data measures generate much redundant data-ink:




[Figure: decorated time-series chart, 1939-40 to 1975-76.]


Bilateral symmetry of data measures also creates redundancy, 
as in the box plot, the open bar, and Chernoff faces: 




Half-faces carry the same information as full faces. Halves may 
be easier to sort (by matching the right half of an unsorted face 
to the left half of a sorted face) than full faces. Or else an 
asymmetrical full face can be used to report additional variables. 
Bilateral symmetry doubles the space consumed by the design 
in a graphic, without adding new information. The few studies 
done on the perception of symmetrical designs indicate that “when 
looking at a vase, for instance, a subject would examine one of its 
symmetric halves, glance at the other half and, seeing that it was 
identical, cease his explorations. . . . The enjoyment of symmetry 
. . . lies not with the physical properties of the figure. At least eye 
movements suggest anything but symmetry, balance, or rest.”²




1 Bernhard Flury and Hans Riedwyl, "Graphical Representation of
Multivariate Data by Means of Asymmetrical Faces," Journal of the
American Statistical Association, 76 (December 1981), 757-765.


2 Leonard Zusne, Visual Perception of Form (New York, 1970),
pp. 256-257.




Redundancy, upon occasion, has its uses: giving a context and 
order to complexity, facilitating comparisons over various parts of 
the data, perhaps creating an aesthetic balance. In cyclical time- 
series, for example, parts of the cycle should be repeated so that 
the eye can track any part of the cycle without having to jump 
back to the beginning. Such redundancy possibly improves Marey’s 
1880 train schedule. Those people leaving Paris or Lyon in the 
evening find that their trains run off the right-hand edge of the 
chart, to be picked up on the left again: 


[Figure: Marey's train schedule. Stations from Paris through
Montereau, Tonnerre, and Mâcon to Lyon Perrache on the vertical axis;
hours of the day on the horizontal axis.]


Attaching an extra half cycle makes every train in the first 24 
hours of the schedule a continuous line (as would mounting the 
original on a cylinder): 


[Figure: the same train schedule with an extra half cycle of hours
attached on the right.]


And, similarly, instead of once around the world in this display
of surface ocean currents, one and two-thirds times around is better:




Kirk Bryan and Michael D. Cox, “The Circulation of the World Ocean:
A Numerical Study. Part I, A Homogeneous Model,” Journal of Physical
Oceanography, 2 (1972), 330.
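
The effect of attaching an extra half cycle to a cyclical chart, as in the train schedule above, can be sketched in a few lines. This is only an illustration under stated assumptions: a trip is reduced to a departure hour and a duration, the function names are hypothetical, and the sample trips are invented rather than read from Marey's schedule.

```python
def split_on_one_cycle(depart, duration, period=24.0):
    """Pieces of one trip on a chart exactly one period wide.

    A trip that runs off the right-hand edge is cut there and picked
    up again at the left-hand edge.
    """
    start = depart % period
    end = start + duration
    if end <= period:
        return [(start, end)]
    return [(start, period), (0.0, end - period)]


def extend_chart(pieces, period=24.0, extra=12.0):
    """Repeat the first `extra` hours on the right of the chart and
    join pieces of the same trip that now meet at the old edge."""
    width = period + extra
    shifted = [(a + period, min(b + period, width))
               for a, b in pieces if a < extra]
    merged = []
    for a, b in sorted(pieces + shifted):
        if merged and abs(merged[-1][1] - a) < 1e-9:
            merged[-1] = (merged[-1][0], b)  # continuous across the seam
        else:
            merged.append((a, b))
    return merged


# Invented evening trip: leaves at 22:00 and runs for 6 hours.
evening_cut = split_on_one_cycle(22.0, 6.0)
evening_whole = extend_chart(evening_cut)

# Invented morning trip: leaves at 03:00 and runs for 5 hours.
morning_whole = extend_chart(split_on_one_cycle(3.0, 5.0))
```

On the one-cycle chart the evening trip is drawn in two pieces; on the extended chart it also appears once as a single unbroken line, at the cost of drawing the early-morning trips twice.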




Most data representations, however, are of a single, uncomplicated 
number, and little graphical repetition is needed. Unless redundancy 
has a distinctly worthy purpose, the second erasing principle applies: 


Erase redundant data-ink, within reason. 


Application of the Principles in Editing and Redesign 


Just as a good editor of prose ruthlessly prunes out unnecessary words, 
so a designer of statistical graphics should prune out ink that fails 
to present fresh data-information. Although nothing can replace 
a good graphical idea applied to an interesting set of numbers, 
editing and revision are as essential to sound graphical design work 
as they are to writing. T. S. Eliot emphasized the “capital importance
of criticism in the work of creation itself. Probably, indeed,
the larger part of the labour of an author in composing his work
is critical labour; the labour of sifting, combining, constructing,
expunging, correcting, testing: this frightful toil is as much critical
as creative.”³

Consider this display, which compares each long bar with the 
adjacent short bar to show the viewer that, under the various 
experimental conditions, the long bar is longer: 


[Figure: bar chart pairing long and short bars under several
experimental conditions.]


3 T. S. Eliot, “The Function of Criticism,” in Selected Essays
1917-1932 (New York, 1932), p. 18.


James T. Kuznicki and N. Bruce McCutcheon, “Cross-Enhancement of the
Sour Taste on Single Human Taste Papillae,” Journal of Experimental
Psychology: General, 108 (1979), 76.


Vigorous pruning improves the graphic immensely, while still 
retaining all the data of the original. It is remarkable that erasing 
alone can work such a transformation: 


[Figure: the same bar chart after erasing.]


The horizontals indicate the paired comparisons and would change 
if the experimental design changed—so they count as information- 
carrying. All the asterisks are out since every paired comparison 
was statistically significant, a point that the caption can note. Here 
is the mix of non-data-ink and redundant data-ink that was erased, 
about 65 percent of the original: 


[Figure: the erased ink.]




The data graphical arithmetic looks like this—the original design 
equals the erased part plus the good part: 


[Figure: the original design = the erased part + the good part.]
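
The same arithmetic can be followed with an ink inventory. The sketch below is hypothetical throughout: the element names, their classification, and the ink quantities are invented so that the erased share comes to 65 percent. They are not measurements of the published graphic.

```python
# Hypothetical inventory: (element, kind, units of ink).
INVENTORY = [
    ("bar tops", "data", 20.0),
    ("paired-comparison horizontals", "data", 10.0),
    ("labels", "data", 5.0),
    ("bar shading and side lines", "redundant", 40.0),
    ("asterisks", "redundant", 5.0),
    ("frame and ticks", "non-data", 20.0),
]


def split_ink(inventory):
    """Return (good_part, erased_part) in units of ink.

    Both erasing principles are applied in full: non-data-ink and
    redundant data-ink are both counted as erased.
    """
    good = sum(ink for _, kind, ink in inventory if kind == "data")
    erased = sum(ink for _, kind, ink in inventory
                 if kind in ("non-data", "redundant"))
    return good, erased


good_part, erased_part = split_ink(INVENTORY)
original = sum(ink for _, _, ink in INVENTORY)
erased_share = erased_part / original
```

The sketch erases everything it flags, so it ignores the "within reason" qualifier attached to both erasing principles; in practice some flagged ink may deserve to stay.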


The next graphic, drawn by the distinguished science illustrator 
Roger Hayward, shows the periodicity of properties of chemical 
elements, exemplified by atomic volume as a function of atomic 
number. The data-ink ratio is less than 0.6, lowered because the 
76 data points and the reference curves are obscured by the 63 dark 
grid marks arrayed over the data plane like a precision marching 
band of 63 mosquitoes: 


[Figure: Atomic Volume (0 to 70) plotted against Atomic Number
(0 to 90), with dark grid marks over the data plane.]

Linus Pauling, General Chemistry (San Francisco, 1947), p. 64.




The grid ticks compete with the essential information of the graphic, 
the curves tracing out the periods and the empirical observations. 
The little grid marks and part of the frame can be safely erased, 
removed from the denominator of the data-ink ratio: 


[Figure: the erased grid marks and frame segments.]


The uncluttered display brings out another aspect of the data: 
several of the elements do not fit the smooth theoretical curves 
all that well. The data-ink ratio has increased to about 0.9, with
only the frame lines remaining as pure non-information: 


[Figure: Atomic Volume against Atomic Number (0 to 80), with the grid
marks erased.]




The reference curves prove essential for organizing the data to 
show the periodicity. The curves create a structure, giving an 
ordering, a hierarchy, to the flow of information from the page: 


[Figure: Atomic Volume against Atomic Number.]


Restoring the grid fails to organize the data. The ticks are too 
powerful, and they also add a disconcerting visual vibration to the 
graphic. With the ticks, the reference curves become all the more 
necessary, since the eye needs some guidance through the maze of 
dots and crosses: 


[Figure: Atomic Volume against Atomic Number (0 to 40), with the grid
ticks restored.]


The space opened up by erasing can be effectively used. Labels for
the initial elements of each period, an alkali, show the beginning
of each cycle in the periodic table of elements, and in the graphic.
The unusual rare-earths are indicated. In addition, the label and
numbers on the vertical axis are turned to read from left to right
rather than bottom to top, making the graphic slightly more
accessible, a little more friendly:


[Figure: redrawn plot of Atomic Volume against Atomic Number
(20 to 80), with the alkali metals Li, Na, K (19), Rb (37), and
Cs (55) and the rare earths labeled.]


Conclusion


Five principles in the theory of data graphics produce substantial
changes in graphical design. The principles apply to many graphics
and yield a series of design options through cycles of graphical
revision and editing.


Above all else show the data.
Maximize the data-ink ratio.
Erase non-data-ink.
Erase redundant data-ink.
Revise and edit.




With savage pictures fill their gaps 
And o'er unhabitable downs 
Place elephants for want of towns. 


Jonathan Swift’s indictment of 17th-century cartographers 


5 Chartjunk: Vibrations, Grids, and Ducks 


The interior decoration of graphics generates a lot of ink that does 
not tell the viewer anything new. The purpose of decoration varies 
—to make the graphic appear more scientific and precise, to enliven 
the display, to give the designer an opportunity to exercise artistic 
skills. Regardless of its cause, it is all non-data-ink or redundant 
data-ink, and it is often chartjunk. Graphical decoration, which 
prospers in technical publications as well as in commercial and 
media graphics, comes cheaper than the hard work required to 
produce intriguing numbers and secure evidence. 

Sometimes the decoration is thought to reflect the artist’s fun- 
damental design contribution, capturing the essential spirit of the 
data and so on. Thus principles of artistic integrity and creativity 
are invoked to defend—even to advance—the cause of chartjunk. 
There are better ways to portray spirits and essences than to get 
them all tangled up with statistical graphics. 

Fortunately most chartjunk does not involve artistic considera- 
tions. It is simply conventional graphical paraphernalia routinely 
added to every display that passes by: over-busy grid lines and 
excess ticks, redundant representations of the simplest data, the 
debris of computer plotting, and many of the devices generating 
design variation. 

Like weeds, many varieties of chartjunk flourish. Here three
widespread types found in scientific and technical research 
work are catalogued—unintentional optical art, the dreaded grid, 
and the self-promoting graphical duck. A hundred chartjunky 
examples from commercial and media graphics have been forgone 
so as to demonstrate the relevance of the critique to the profes- 
sional scientific production of data graphics. 


Unintentional Optical Art 


Contemporary optical art relies on moiré effects, in which the 
design interacts with the physiological tremor of the eye to pro- 
duce the distracting appearance of vibration and movement. 




The effect extends beyond the ink of the design to the whole page.
When exploited by the experts, such as Bridget Riley and Victor
Vasarely, op art effects are undoubtedly eye-catching.
But statistical graphics are also often drawn up so as to shimmer.
This moiré vibration, probably the most common form of graphical
clutter, is inevitably bad art and bad data graphics. The noise
clouds the flow of information, as these examples from technical and
scientific publications illustrate:


[Figure: Tecidos de Algodão (Cotton Textiles), with panels Producção
de Tecidos (Textile Production), Fabricas de Tecidos (Cotton
Factories), Capitaes das Fabricas, Impostos de Consumo Tecidos, and
Importação (Importation).]

Instituto de Expansão Commercial, Brasil: Graphicos
Economicos-Estatisticas (Rio de Janeiro, 1929), p. 15.


CHARTJUNK 109 


[Figure: Severity of Aortic Regurgitation (vertical axis, 0.5 to 2.0)
against Months after Operation (horizontal axis).]

Figure 2. Serial Echocardiographic Assessments of the Severity of
Regurgitation in the Pulmonary Autograft in 31 Patients. The numerical
grades were assigned according to the severity of regurgitation, as
follows: 0, none; 0.5, trivial; 1.0 to 1.5, mild; 2.0, moderate; and
3.0, severe.


On this page, what should have been simple tables are turned into
bad graphics published in major scientific journals. Above, a duck
moiré with an unintentional Necker Illusion, as the two back planes
optically flip to the front. Some pyramids conceal others; and one
variable (stacked depth of the stupid pyramids) has no label or scale.

Below, we learn very little about data, but do discover that moiré
vibration may well be at a maximum for equally spaced bars:


[Figure: two bar charts; one axis is labeled [14C]glucose in
S. mansoni.]

Nicholas T. Kouchoukos, et al., “Replacement of the Aortic Root with
a Pulmonary Autograft in Children and Young Adults with Aortic-Valve
Disease,” The New England Journal of Medicine, 330 (January 6, 1994),
p. 4.

James T. Kuznicki and N. Bruce McCutcheon, “Cross-Enhancement of the
Sour Taste on Single Human Taste Papillae,” Journal of Experimental
Psychology: General, 108 (1979), 76.

Eain M. Cornford and Marie E. Huot, “Glucose Transfer from Male to
Female Schistosomes,” Science, 213 (September 11, 1981), 1270.




And, finally, from the style sheet once provided by the Journal 
of the American Statistical Association, a graphic described as 
“an example of a figure prepared in the proper form”: 


[Figure: A. Average Probabilities of W from N(1,1) with n = 10.
Vertical axis: Average Probability, marked at 0.05, 0.10, and 0.15.]


The display required 131 line-strokes and 15 digits to communicate 
its simple information. The vibrating lines are poorly drawn, 
unevenly spaced, and misaligned with the vertical axis. 


Vibrating chartjunk even frequents the graphics of major 
scientific journals: 


The ten most frequently cited (footnoted) scientific journals:
random sample of issues published 1980-1982

                                            Percentage of      Number of
                                            graphics with      graphics
                                            moiré vibration    in sample

Biochemistry                                       2%             568
Journal of Biological Chemistry                    2%             565
Journal of the American Chemical Society           3%             317
Journal of Chemical Physics                        6%             327
Biochimica et Biophysica Acta                      8%             432
Nature                                            11%             225
Proceedings of the National Academy of
  Sciences, U.S.A.                                12%             438
Lancet                                            15%             364
Science                                           17%             311
New England Journal of Medicine                   21%             338
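
A pooled figure for the ten journals can be worked out from the table by weighting each percentage by its sample size. The calculation below is a derived estimate, not a number reported in the text, and it is only approximate because the published percentages are rounded; the variable and function names are illustrative.

```python
# (journal, percent of graphics with moiré vibration, graphics in sample)
JOURNALS = [
    ("Biochemistry", 2, 568),
    ("Journal of Biological Chemistry", 2, 565),
    ("Journal of the American Chemical Society", 3, 317),
    ("Journal of Chemical Physics", 6, 327),
    ("Biochimica et Biophysica Acta", 8, 432),
    ("Nature", 11, 225),
    ("Proceedings of the National Academy of Sciences, U.S.A.", 12, 438),
    ("Lancet", 15, 364),
    ("Science", 17, 311),
    ("New England Journal of Medicine", 21, 338),
]


def pooled_share(rows):
    """Sample-size-weighted share of graphics with moiré vibration."""
    total = sum(n for _, _, n in rows)
    vibrating = sum(pct / 100.0 * n for _, pct, n in rows)
    return vibrating / total


total_graphics = sum(n for _, _, n in JOURNALS)
pooled = pooled_share(JOURNALS)  # about 0.088
```

Across the whole sample, then, something like one graphic in eleven vibrates, with the medical and general-science journals well above that level.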


“JASA Style Sheet,” Journal of the American Statistical Association,
71 (March 1976), 260-261.




Moiré effects have proliferated with computer graphics (in 
programs such as Excel). Such unfortunate patterns were once 
generated by means of thin plastic transfer sheets; now the 
computer produces instant chartjunk. Shown here are a few of 
the many vibrating possibilities. Cross-hatching should be replaced 
with tint screens of shades of gray. Specific areas on a graphic 
should be labeled with words rather than encoded with hatching. 


[Figure: samples of vibrating dot, line, and cross-hatch fill
patterns.]

This form of chartjunk is a twentieth-century innovation, and
computer graphics are multiplying it more than ever. The handbooks
and textbooks of statistical graphics, along with user’s manuals
for computer graphics programs, are filled up with vibrating
graphics, presented as exemplars of design. Note the high
proportion of chartjunky graphics in the more recent publications.
Computer graphics are particularly active:


Textbooks and handbooks of statistical graphics,      Percentage of
and manuals for computer graphics programs            graphics with      Total number
(ordered by date of publication)                      moiré vibration    of graphics

Willard C. Brinton, Graphic Methods for
  Presenting Facts (New York, 1914)                         12%              255
R. Satet, Les Graphiques (Paris, 1932)                      29%               28
Herbert Arkin and Raymond R. Colton, Graphs:
  How to Make and Use Them (New York, 1936)                 17%               95
Mary Eleanor Spear, Charting Statistics
  (New York, 1952)                                          46%              134
Anna C. Rogers, Graphic Charts Handbook
  (Washington, D.C., 1961)                                  32%              201
F. J. Monkhouse and H. R. Wilkinson, Maps
  and Diagrams (London, third edition, 1971)                14%              322
Calvin F. Schmid and Stanton E. Schmid,
  Handbook of Graphic Presentation (New York,
  second edition, 1979)                                     22%              399
A. J. MacGregor, Graphics Simplified
  (Toronto, 1979)                                           34%               65
The user's manual for a widely distributed
  computer graphics package: SAS/GRAPH User's
  Guide (Cary, North Carolina, 1980)                        68%               28
The manual for a very extensive computer
  graphics program: Tell-A-Graf User's Manual
  (San Diego, 1981)                                         53%              459


Can optical art effects ever produce a better graphic? Bertin
exhorts: “It is the designer's duty to make the most of this variation;
to obtain the resonance [of moiré vibration] without provoking
an uncomfortable sensation: to flirt with ambiguity without
succumbing to it.”¹ But can statistical graphics successfully
“flirt with ambiguity”? It is a clever idea, but no good examples
are to be found. The key difficulty remains: moiré vibration
is an undisciplined ambiguity, with an illusive, eye-straining quality
that contaminates the entire graphic. It has no place in data
graphical design.


1 Jacques Bertin, Semiology of Graphics: Diagrams, Networks, Maps
(Madison, Wisconsin, 1983, translated by William J. Berg), p. 80;
this book is the English translation of Bertin’s important work,
Sémiologie graphique (Paris, 1967).


The Grid 


One of the more sedate graphical elements, the grid should usually
be muted or completely suppressed so that its presence is only
implicit, lest it compete with the data. Grids are mostly for the
initial plotting of data at home or office rather than for putting
into print. Dark grid lines are chartjunk. They carry no information,
clutter up the graphic, and generate graphic activity unrelated
to data information. This grid camouflages the profile of the data in
the age-sex pyramid of the population of France in 1967:


[Figure: Population of France, by Age and Sex: January 1, 1967.
Age (0 to 100) and year of birth (1866 to 1966) on the vertical axis;
population in thousands (0 to 500) on the horizontal axis for each
sex; drawn over a dark grid. Annotations:
(a) Military losses in World War I
(b) Deficit of births during World War I
(c) Military losses in World War II
(d) Deficit of births during World War II
(e) Rise of births due to demobilization after World War II]


A revision quiets the grid and gives emphasis to the data: 


[Figure: the same age-sex pyramid with a muted grid.]

Based on data in Institut National de la Statistique et des Études
Economiques, Annuaire statistique de la France, 1968 (Paris, 1968),
pp. 32-33; redrawn in Henry S. Shryock and Jacob S. Siegel, The
Methods and Materials of Demography (Washington, D.C., 1973),
vol. 1, 242.




The space occupied by the doubled grid lines consumes 18 percent
of the area of this otherwise most ingenious design, a “multiwindow
plot.” Optical white dots appear at the intersections of
the grid lines. (The plot shows the following: The large square
contains X1, X2 scatterplots for the indicated levels of X3 and X4.
The marginal plots on the right are conditioned on X3 and the
plots at the top on X4. The upper right corner shows the
unconditional X1, X2 scatter.) Redrawing eliminates the vibration:


[Figure: Multiwindow Plot of Particle Physics Momentum Data.]

Paul A. Tukey and John W. Tukey, "Data-Driven View Selection;
Agglomeration and Sharpening,” in Vic Barnett, ed., Interpreting
Multivariate Data (Chichester, England, 1981), 231-232.




The grid in the classic Marey train schedule is very active: 


[Figure: Marey's train schedule, Paris to Lyon Perrache, with its
original dark grid.]


Thinning the grid lines helps a little bit:


[Figure: the same schedule with thinner grid lines.]




A better treatment, however, is a gray grid: 


[Figure: the same schedule, Paris to Lyon Perrache, with a gray grid.]


When a graphic serves as a look-up table, then a grid may help 
in reading and interpolating. But even in this case the grids should 
be muted relative to the data. A gray grid works well and, with a 
delicate line, may promote more accurate data reconstruction 
than a dark grid. 
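
A muted grid is easy to specify when a graphic is generated by program. The sketch below builds a small SVG drawing by hand, using only string formatting: the grid is drawn first in thin light gray lines, and the data points are drawn last in black. The function name, the colors, the line width, and the sample points are assumptions chosen for illustration.

```python
def svg_plot(points, width=300, height=200, step=50,
             grid_color="#cccccc", grid_width=0.5, data_color="#000000"):
    """Return an SVG string: a faint gray grid with dark data on top."""
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" '
             'width="{}" height="{}">'.format(width, height)]
    # Grid lines: light and thin, so they stay in the background.
    for x in range(0, width + 1, step):
        parts.append('<line x1="{0}" y1="0" x2="{0}" y2="{1}" '
                     'stroke="{2}" stroke-width="{3}"/>'
                     .format(x, height, grid_color, grid_width))
    for y in range(0, height + 1, step):
        parts.append('<line x1="0" y1="{0}" x2="{1}" y2="{0}" '
                     'stroke="{2}" stroke-width="{3}"/>'
                     .format(y, width, grid_color, grid_width))
    # Data points: dark, and drawn last so the grid never covers them.
    for x, y in points:
        parts.append('<circle cx="{}" cy="{}" r="3" fill="{}"/>'
                     .format(x, height - y, data_color))
    parts.append("</svg>")
    return "\n".join(parts)


# Invented data points, in drawing units.
svg = svg_plot([(50, 60), (120, 150)])
```

Suppressing the grid altogether amounts to skipping the two grid loops; the data layer is unchanged.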

Most ready-made graph paper comes with a darkly printed grid. 
The reverse (unprinted) side should be used, for then the lines 
show through faintly and do not clutter the data. If the paper is 
heavily gridded on both sides, throw it out. 


Self-Promoting Graphics: The Duck 


When a graphic is taken over by decorative forms or computer
debris, when the data measures and structures become Design
Elements, when the overall design purveys Graphical Style rather
than quantitative information, then that graphic may be called a
duck in honor of the duck-form store, “Big Duck.” For this building
the whole structure is itself decoration, just as in the duck data
graphic. In Learning from Las Vegas, Robert Venturi, Denise Scott
Brown, and Steven Izenour write about the ducks of modern
architecture, and their thoughts are relevant to the design of data
graphics as well:


When Modern architects righteously abandoned ornament on
buildings, they unconsciously designed buildings that were
ornament. In promoting Space and Articulation over symbolism
and ornament, they distorted the whole building into
a duck. They substituted for the innocent and inexpensive
practice of applied decoration on a conventional shed the
rather cynical and expensive distortion of program and structure
to promote a duck. . . . It is now time to reevaluate the
once-horrifying statement of John Ruskin that architecture is
the decoration of construction, but we should append the
warning of Pugin: It is all right to decorate construction but
never construct decoration.²




2 Robert Venturi, Denise Scott Brown, and Steven Izenour, Learning
from Las Vegas (Cambridge, revised edition, 1977), p. 163. The
initial statement of the duck concept is found on pp. 87-103.


Big Duck, Flanders, New York; photograph by Edward Tufte, July 2000.




The addition of a fake perspective to the data structure clutters 
many graphics. This variety of chartjunk, now at high fashion in 
the world of Boutique Data Graphics, abounds in corporate annual 
reports, the phony statistical studies presented in advertisements, 
the mass media, and the more muddled sorts of social science 
research. 

A series of weird three-dimensional displays appearing in the 
magazine American Education in the 1970s delighted connoisseurs 
of the graphically preposterous. Here five colors report, almost by 
happenstance, only five pieces of data (since the division within 
each year adds to 100 percent). This may well be the worst graphic 
ever to find its way into print: 


[Figure: Age Structure of College Enrollment. Percent of total
enrollment, under 25 and 25 and over, for 1972 through 1976.]




There are some superbly produced ducks:


[Figure: Applied Irrigation Water, 1972, with a legend of crop types.
Each block represents 5,000 acre-feet of water applied to that crop
type; each number represents the total acre-feet of applied water in
that Hydrologic Basin area.]

William L. Kahrl, et al., The California Water Atlas (Sacramento,
1978, 1979), p. 55.




Occasionally designers seem to seek credit merely for possessing
a new technology, rather than using it to make better designs.
Computers and their affiliated apparatus can do powerful things
graphically, in part by turning out the hundreds of plots necessary
for good data analysis. But at least a few computer graphics only
evoke the response “Isn’t it remarkable that the computer can be
programmed to draw like that?” instead of “My, what interesting
data.”


[Figure: computer-drawn bar chart of percentages by issue area, with
a cross-hatched legend: Inflation, Unemployment, Shortages, Race,
Crime, Govt. Power, Confidence, Watergate, Competence.]


The symptoms of the We-Used-A-Computer-To-Build-A-Duck
Syndrome appear in this display from a professional journal: the
thin substance; the clotted, crinkly lettering all in upper-case sans
serif; the pointlessly ordered cross-hatching; the labels written
in computer abbreviations; the optical vibration, all these the
by-products of the technology of graphic fabrication. The overly
busy vertical scaling shows more percentage markers and labels
than there are actual data points. The observed values of the
percentages should be printed instead. Since the information consists
of a few numbers and a good many words, it is best to pass
up the computerized graphics capability this time and tell the
story with a table:

Arthur H. Miller, Edie N. Goldenberg, and Lutz Erbring, “Type-Set
Politics: Impact of Newspapers on Public Confidence,” American
Political Science Review, 73 (1979), 67-84.




                                                             Percent of articles with
Content and tone of front-page                  Number       negative criticism of
articles in 94 U.S. newspapers,                 of articles  specific person or policy
October and November, 1974

Watergate: defendants and prosecutors,
  Ford’s pardon of Nixon                           537              49%
Inflation, high cost of living                     415              28%
Government competence: costs, quality,
  salaries of public employees                     322              30%
Confidence in government: power of
  special interests, trust in political
  leaders, dishonesty in politics                  266              52%
Government power: regulation of business,
  secrecy, control of CIA and FBI                  154              42%
Crime                                              123              30%
Race                                               103              25%
Unemployment                                       100              13%
Shortages: energy, food                             68              16%
Conclusion 


Chartjunk does not achieve the goals of its propagators. The 
overwhelming fact of data graphics is that they stand or fall on 
their content, gracefully displayed. Graphics do not become 
attractive and interesting through the addition of ornamental 
hatching and false perspective to a few bars. Chartjunk can turn 
bores into disasters, but it can never rescue a thin data set. The 
best designs (for example, Minard on Napoleon in Russia, Marey’s 
graphical train schedule, the cancer maps, the Times weather his- 
tory of New York City, the chronicle of the annual adventures 
of the Japanese beetle, the new view of the galaxies) are intriguing 
and curiosity-provoking, drawing the viewer into the wonder of the 
data, sometimes by narrative power, sometimes by immense detail, 
and sometimes by elegant presentation of simple but interesting 
data. But no information, no sense of discovery, no wonder, no 
substance is generated by chartjunk. 


Forgo chartjunk, including
moiré vibration,
the grid, and the duck.




Painting is special, separate, a matter of meditation and contemplation, 
for me, no physical action or social sport. As much consciousness as 
possible. Clarity, completeness, quintessence, quiet. No noise, no schmutz, 
no schmerz, no fauve schwärmerei. Perfection, passiveness, consonance,
consummateness. No palpitations, no gesticulation, no grotesquerie.
Spirituality, serenity, absoluteness, coherence. No automatism,
no accident, no anxiety, no catharsis, no chance. Detachment,
disinterestedness, thoughtfulness, transcendence. No humbugging,
no button-holing, no exploitation, no mixing things up.


Ad Reinhardt, statement for the catalogue of the exhibition, “The New Decade: 
35 American Painters and Sculptors,” Whitney Museum of American Art, 
New York, 1955. 


6 Data-Ink Maximization and Graphical Design 


So far the principles of maximizing data-ink and erasing have 
helped to generate a series of choices in the process of graphical 
revision. This is an important result, but can the ideas reach be-
yond the details and particularities of editing? Is it possible to do
what a theory of graphics is supposed to do, that is, to derive new 
graphical forms? In this chapter the principles are applied to many 
graphical designs, basic and advanced, including box plots, bar 
charts, histograms, and scatterplots. New designs result. 


Redesign of the Box Plot 


Mary Eleanor Spear's “range bar”

[Figure: range bar, marking the range from lowest to highest amount, the median, and the interquartile range.]

and John Tukey's “box plot”

[Figure: box plot, marking the maximum, upper quartile, median, lower quartile, and minimum.]

can be mostly erased without loss of information:

Mary Eleanor Spear, Charting Statistics (New York, 1952), p. 166; and John W. Tukey, Exploratory Data Analysis (Reading, Massachusetts, 1977).



The revised design, a quartile plot, shows the same five numbers. It
is easy to draw by hand or computer and, most importantly, can
replace the conventional scatterplot frame. The straightedge need
only be placed on the paper once to draw the quartile plot, com-
pared to six separate placings for the box plot. An alternative is


but this design will not work effectively to frame a scatterplot. 
Nor does it look very good. 

Perhaps special emphasis should be given to the middle half of 
the distribution, however, as in the box plot. This can be done by 
changing line weights 


or, even better, by offsetting the middle half: 


This latter design is the preferred form of the quartile plot. It uses 
the ink effectively and looks good. 
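
The offset quartile plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quartiles are taken as medians of the lower and upper halves of the sorted data (the text does not prescribe a quartile rule), the median is returned as a point whose marking is left to the drawing code, and the names `five_number_summary`, `quartile_plot_segments`, and the `offset` parameter are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Quartiles are medians of the lower and upper halves of the sorted
    data, leaving out the middle value when the count is odd. This is one
    common convention and an assumption of this sketch.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    return xs[0], median(xs[:half]), median(xs), median(xs[n - half:]), xs[-1]


def quartile_plot_segments(values, x=0.0, offset=0.1):
    """Describe an offset quartile plot as vertical line segments.

    The two outer quarters are drawn at x; the middle half is shifted
    sideways to x + offset, which gives it emphasis without a box.
    """
    lo, q1, med, q3, hi = five_number_summary(values)
    return {
        "lower_whisker": ((x, lo), (x, q1)),
        "middle_half": ((x + offset, q1), (x + offset, q3)),
        "upper_whisker": ((x, q3), (x, hi)),
        "median": (x + offset, med),
    }


data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]  # made-up values
print(five_number_summary(data))  # (2, 4, 7.5, 11, 15)
print(quartile_plot_segments(data, x=0, offset=1))
```

Each entry is a pair of end points, so any drawing routine that can stroke a line can render the plot.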


In these revisions of the box plot, the principle of maximizing 
data-ink has suggested a variety of designs, but the choice of the 
best overall arrangement naturally also rests on statistical and 
aesthetic criteria—in other words, the procedure is one of reasonable 
data-ink maximizing. 




The same logic applies to many similar designs, such as this 
“parallel schematic plot.” The original required 80 separate plac- 
ings of the straightedge, 50 horizontals and 30 verticals: 




An erased version requires only 10 verticals to show the same 
information: 


The large reduction in the amount of drawing is relevant for the 
use of such designs in informal, exploratory data analysis, where 
the research worker's time should be devoted to matters other 
than drawing lines. 




Redesign of the Bar Chart/Histogram


Here is the standard model bar chart, with the design endorsed by 
the practices and the style sheets of many statistical and scientific 
publications: 


Its architecture differs little from Playfair’s original design: 


[Figure: William Playfair, “Exports and Imports of Scotland to and from different parts for one Year from Christmas 1780 to Christmas 1781.” The upright divisions are ten thousand pounds each; the black lines are exports, the ribbed lines imports.]


The box can be erased: 


And the vertical axis, except for the ticks: 


Even part of the data measures can be erased, making a white 
grid, which shows the coordinate lines more precisely than ticks 
alone: 




The white grid eliminates the tick marks, since the numerical labels 
on the vertical are tied directly to the white lines: 




Although the intersection of the thicker bar with the thinner base- 
line creates an attractive visual effect (but also the optical illusion 
of gray dots at the intersections), the baseline can be erased since 
the bars define the end-point at the bottom: 


Still, a thin baseline looks good: 


Erasing and data-ink maximizing have induced changes in the
plain old bar chart. The techniques—no frame, no vertical axis,
no ticks, and the white grid—apply to other designs:
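
As a rough sketch of the white grid, the following Python function writes a bar chart as an SVG string using only the techniques listed: no frame, no vertical axis, no ticks, and white lines drawn across the bars at each grid value. The function name, the pixel sizes, and the sample values are assumptions made for illustration; the numerical labels that would sit beside the white lines are left out to keep the example short.

```python
def white_grid_bar_chart(values, grid_step, bar_width=20, gap=10, scale=4):
    """Return an SVG string for a frameless bar chart with a white grid."""
    if not values or min(values) < 0 or grid_step <= 0:
        raise ValueError("need non-negative values and a positive grid step")
    top = max(values)
    width = len(values) * bar_width + (len(values) - 1) * gap
    height = top * scale
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
    ]
    # Bars only: no frame, no vertical axis, no tick marks.
    for i, v in enumerate(values):
        x = i * (bar_width + gap)
        parts.append(
            f'<rect x="{x}" y="{height - v * scale}" '
            f'width="{bar_width}" height="{v * scale}" fill="black"/>'
        )
    # White lines drawn on top of the bars. On a white page they are
    # visible only where they cross a bar, which is the white grid.
    level = grid_step
    while level < top:
        y = height - level * scale
        parts.append(
            f'<line x1="0" y1="{y}" x2="{width}" y2="{y}" stroke="white"/>'
        )
        level += grid_step
    parts.append("</svg>")
    return "\n".join(parts)


print(white_grid_bar_chart([5, 12, 8], grid_step=5))
```

Because the grid is made by removing ink from the bars, it adds no non-data-ink to the page.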


[Figure: “Variable Width Notched Box Plot,” Telephone Bill vs Years Lived in Chicago. Vertical axis: 1 Month's Telephone Bill ($), logarithmic scale; horizontal axis: Years Lived in Chicago. Non-overlapping of notches indicates significant difference at rough 95% level; width of box proportional to root group size.]




Robert McGill, John W. Tukey, and 
Wayne A. Larsen, “Variations of Box 
Plots,” American Statistician, 32 (1978), 
12-16. 




Redesign of the Scatterplot 


Consider the standard bivariate scatterplot: 


A useful fact, brought to notice by the maximization and erasing 
principles, is that the frame of a graphic can become an effective 
data-communicating element simply by erasing part of it. The 
frame lines should extend only to the measured limits of the data 
rather than, as is customary, to some arbitrary point like the next 
round number marking off the grid and grid ticks of the plot. 
That part of the frame exceeding the limits of the observed data 
is trimmed off: 


The result, a range-frame, explicitly shows the maximum and min- 
imum of both variables plotted (along with the range), information 
available only by extrapolation and visual estimation in the con- 
ventional design. The data-ink ratio has increased: some non-data- 
ink has been erased, and the remainder of the frame, now carrying 
information, has gone over to the side of data-ink. 
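
A minimal sketch of the range-frame computation, assuming the data arrive as a list of (x, y) pairs: each frame line is drawn only between the observed minimum and maximum of its variable. The function name and the `gap` parameter (how far the frame sits outside the point cloud) are illustrative choices, not part of the original design description.

```python
def range_frame(points, gap=0.0):
    """Return the two line segments of a range-frame for (x, y) points.

    The horizontal line runs from min x to max x and the vertical line
    from min y to max y, so the frame itself reports both ranges.
    """
    if not points:
        raise ValueError("need at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    return {
        # Bottom frame line, placed `gap` below the lowest point.
        "x_axis": ((x_min, y_min - gap), (x_max, y_min - gap)),
        # Left frame line, placed `gap` to the left of the leftmost point.
        "y_axis": ((x_min - gap, y_min), (x_min - gap, y_max)),
    }


frame = range_frame([(1, 3), (5, 0), (3, 8)], gap=1)  # made-up points
print(frame["x_axis"])  # ((1, -1), (5, -1))
print(frame["y_axis"])  # ((0, 0), (0, 8))
```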




[Figure: the horizontal frame line of the conventional scatterplot compared with that of the range-frame, which runs only from the minimum to the maximum of X.]


A range-frame does not require any viewing or decoding in- 
structions; it is not a graphical puzzle and most viewers can easily 
tell what is going on. Since it is more informative about the data 
in a clear and precise manner, the range-frame should replace the 
non-data-bearing frame in many graphical applications. 




A small shift in the remaining ink turns each range-frame into 
a quartile plot: 


Erasing and editing has led to the display of ten extra numbers 
(the minimum, maximum, two quartiles, and the median for both 
variables). The design is useful for analytical and exploratory data 
analysis, as well as for published graphics where summary char- 
acterizations of the marginal distributions have interest. The design 
is nearly always better than the conventionally framed scatterplot. 


Range-frames can also present ranges along a single dimension. 
Here the historical high and low are shown in the vertical frame. 
This is an excellent practice and should be used widely in all sorts 
of displays, both scientific and unscientific: 




Finally, the entire frame can be turned into data by framing the
bivariate scatter with the marginal distribution of each variable.
The dot-dash-plot results.¹

[Figure: dot-dash-plot of the observations (x, y), with the marginal distribution of each variable shown as dashes along its axis.]

1 The terminology follows tradition, for scatterplots were once called “dot diagrams”—for example, in R. A. Fisher’s Statistical Methods for Research Workers (Edinburgh, 1925).



The dot-dash-plot combines the two fundamental graphical 
designs used in statistical analysis, the marginal frequency distri- 
bution and the bivariate distribution. Dot-dash-plots make routine 
what good data analysts do already—plotting marginal and joint
distributions together. 
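
The construction can be sketched as follows, assuming points given as (x, y) pairs. The dots are the bivariate scatter; each point also contributes one short dash to the bottom margin at its x value and one to the left margin at its y value. The function name and the `dash_len` and `gap` parameters are hypothetical.

```python
def dot_dash_plot(points, dash_len=1.0, gap=1.0):
    """Return (dots, x_dashes, y_dashes) for a dot-dash-plot.

    dots     -- the points themselves (the bivariate distribution)
    x_dashes -- short vertical strokes below the scatter, one per x value
    y_dashes -- short horizontal strokes left of the scatter, one per y value
    """
    if not points:
        raise ValueError("need at least one point")
    xs = sorted(x for x, _ in points)
    ys = sorted(y for _, y in points)
    y0 = ys[0] - gap  # top end of the dashes in the bottom margin
    x0 = xs[0] - gap  # right end of the dashes in the left margin
    x_dashes = [((x, y0 - dash_len), (x, y0)) for x in xs]
    y_dashes = [((x0 - dash_len, y), (x0, y)) for y in ys]
    return list(points), x_dashes, y_dashes


dots, x_dashes, y_dashes = dot_dash_plot([(1, 4), (2, 2), (3, 9)])
print(len(dots), len(x_dashes), len(y_dashes))  # 3 3 3
```

No separate frame is computed: the two fringes of dashes stand in for it.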

An empirical cumulative distribution of residuals on a normal 
grid shows the outer 18 terms plus the 30th term, with all 60 
points plotted in the marginal distribution: 


[Figure: empirical cumulative distribution of residuals on a normal grid, with all 60 points plotted in the marginal distribution.]

Cuthbert Daniel, Applications of Statistics to Industrial Experimentation (New York, 1976), p. 155.




Similarly, this data-rich graphic of signals from pulsars shows both
marginal distributions:

[Figure: intensities of pulsar subpulses across the receiver bandwidth, with marginal distributions plotted at the top and on the right; horizontal axis: FREQUENCY; a RESOLUTION CELL is marked.]

Narrowband spectra of individual subpulses. Each point of the intensity
I(t) plotted on the right is the sum of the distribution of intensities across the
receiver bandwidth shown in the center. At the top is plotted the spectrum averaged
over the pulse. In the limit of many thousands of pulses this would show the receiver
bandpass shape.

Timothy H. Hankins and Barney J. Rickett, “Pulsar Signal Processing,” in Berni Alder, et al., eds., Methods in Computational Physics, Volume 14: Radio Astronomy (New York, 1975), p. 108.




The fringe of dashes in the dot-dash-plot can connect a series of 
bivariate scatters in a rugplot (since it resembles a set of fringed 
rugs—and covers the statistical ground): 




Reflecting the one-dimensional projections from each scatter, the 
dashes encourage the eye to notice how each plot filters and trans- 
lates the data through the scatter from one adjacent plot to the 
next. Sometimes it is useful to think of each bivariate scatter as the 
imperfect empirical representation of an underlying curve that 
transforms one variable into another. In the rugplot, the sequence 
of variables can wander off as appropriate. The quantitative history 
of a single observation can be traced through a series of one- and 
two-dimensional contexts. 




Conclusion 


The first part of a theory of data graphics is in place. The idea, as
described in the previous three chapters, is that most of a graphic’s 
ink should vary in response to data variation. The theory has 
something to say about a great variety of graphics—workaday 
scientific charts, the unique drawings of Roger Hayward, the 
exemplars of graphical handbooks, newspaper displays, computer 
graphics, standard statistical graphics, and the recent inventions of 
Chernoff and Tukey. 

The observed increases in efficiency, in how much of the graphic’s 
ink carries information, are sometimes quite large. In several cases, 
the data-ink ratio increased from .1 or .2 to nearly 1.0. The trans- 
formed designs are less cluttered and can be shrunk down more 
readily than the originals. 

But, are the transformed designs better? 

(1) They are necessarily better within the principles of the theory, 
for more information per unit of space and per unit of ink is dis- 
played. And this is significant; indeed, the history of devices for 
communicating information is written in terms of increases in 
efficiency of communication and production.

(2) Graphics are almost always going to improve as they go 
through editing, revision, and testing against different design op- 
tions. The principles of maximizing data-ink and erasing generate 
graphical alternatives and also suggest a direction in which revi- 
sions should move. 

(3) Then there is the audience: will those looking at the new 
designs be confused? Some of the designs are self-explanatory, as 
in the case of the range-frame. The dot-dash-plot is more difficult, 
although it still shows all the standard information found in the 
scatterplot. Nothing is lost to those puzzled by the frame of dashes, 
and something is gained by those who do understand. Moreover, 
it is a frequent mistake in thinking about statistical graphics to
underestimate the audience. Instead, why not assume that if you
understand it, most other readers will, too? Graphics should be as
intelligent and sophisticated as the accompanying text.

(4) Some of the new designs may appear odd, but this is probably 
because we have not seen them before. The conventional designs 
for statistical graphics have been viewed thousands of times by 
nearly every reader of this book; on the other hand, the range- 
frame, the dot-dash-plot, the white grid, the quartile plot, the 
rugplot, and the half-face just a few times. With use, the new 
designs will come to look just as reasonable as the old. 




Maximizing data ink (within reason) is but a single dimension of
a complex and multivariate design task. The principle helps con-
duct experiments in graphical design. Some of those experiments
will succeed. There remain, however, many other considerations
in the design of statistical graphics—not only of efficiency,
but also of complexity, structure, density, and even beauty.




7  Multifunctioning Graphical Elements 


The same ink should often serve more than one graphical purpose. 
A graphical element may carry data information and also perform 
a design function usually left to non-data-ink. Or it might show 
several different pieces of data. Such multifunctioning graphical 
elements, if designed with care and subtlety, can effectively display 
complex, multivariate data.¹
Consider, for example, the multifunctioning blot of the blot
map. It simultaneously locates the geographic unit on a two-
dimensional surface, describes the shape of the geographic unit,
and indicates the level of the variable displayed by color or inten-
sity of shading. That is a great deal of information for a small
patch of ink—and the different pieces of information are not
confounded and mixed together.

1 The idea of double-functioning elements appears in architectural criticism; see Robert Venturi, Complexity and Contradiction in Architecture (New York, second edition, 1977), ch. 5. Venturi in [...] 1955).

In contrast, the conventional graphical frame performs only a 
modest design function, the separation of the grid and data mea- 
sures from the labels. And it is a place to hang the grid ticks. With 
all that ink doing so little, it is a prime candidate for mobilization 
as a double-functioning graphical element. Hence the range-frame, 
the quartile frame, and the dot-dash-plot. 
The principle, then, is: 


Mobilize every graphical element, perhaps 
several times over, to show the data. 


The danger of multifunctioning elements is that they tend to 
generate graphical puzzles, with encodings that can only be broken 
by their inventor. Thus design techniques for enhancing graphical 
clarity in the face of complexity must be developed along with 
multifunctioning elements. 


Data-Built Data Measures 


The graphical element that actually locates or plots the data is the 
data measure. The bars of a bar chart, the dots of a scatterplot, the 
dots and dashes of a dot-dash-plot, the blots of a blot map are 
data measures. The ink of the data measure can itself carry data; 
for example, the dots of the scatterplot can take on different 
shadings in response to a third variable. 




Building data measures out of the data increases the quantitative 
detail and dimensionality of a graphic. The stem-and-leaf plot 
constructs the distribution of a variable with numbers themselves: 


 0 | 98766562
 1 | 97719630
 2 | 69987766544422211009850
 3 | 876655412099551426
 4 | 9998844331929433361107
 5 | 97666666554422210097731
 6 | 898665441077761065
 7 | 98855431100652108073
 8 | 653322122937
 9 | 377655421000493
10 | 0984433165212
11 | 4963201631
12 | 45421164
13 | 47830
14 | 00
15 | 676
16 | 52
17 | 92
18 | 5
19 | 39730

0|9 = 900 feet          19|3 = 19,300 feet

Stem-and-leaf displays:
heights of 218 volcanoes, unit 100 feet.


The idea of making every graphical element effective was behind 
the design of the stem-and-leaf plot. In presenting his invention, 
John Tukey wrote: “If we are going to make a mark, it may as well
be a meaningful one. The simplest—and most useful—meaningful
mark is a digit.”²
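
A stem-and-leaf display is simple to build by program. The sketch below assumes non-negative whole-number measurements and a `unit` in which leaves are expressed (100 for heights recorded to the nearest 100 feet); values are cut, not rounded, to that unit. The function name and the sample heights are invented for illustration, and the leaves are sorted within each stem, which is a choice of this sketch.

```python
from collections import defaultdict


def stem_and_leaf(values, unit=1):
    """Return the lines of a stem-and-leaf display.

    Each value is cut to a whole number of `unit`s; the last digit of
    that number becomes the leaf and the remaining digits the stem.
    """
    if not values:
        raise ValueError("need at least one value")
    leaves = defaultdict(list)
    for v in values:
        if v < 0:
            raise ValueError("this sketch handles non-negative values only")
        stem, leaf = divmod(int(v // unit), 10)
        leaves[stem].append(leaf)
    top = max(leaves)
    width = len(str(top))
    lines = []
    # Empty stems are kept so that the shape of the distribution is honest.
    for stem in range(min(leaves), top + 1):
        digits = "".join(str(d) for d in sorted(leaves.get(stem, [])))
        lines.append(f"{stem:>{width}} | {digits}".rstrip())
    return lines


heights_ft = [900, 1200, 1900, 19300, 2300, 2100]  # made-up heights in feet
for line in stem_and_leaf(heights_ft, unit=100):
    print(line)
```

Read back with the unit, the first line ` 0 | 9` stands for 900 feet and the last line `19 | 3` for 19,300 feet.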
Here, too, the data form the data measure. Note the bimodal
distribution in the histogram of college students arranged by height.


2 “Some Graphic and Semigraphic Displays,” in T. A. Bancroft, ed., Statistical
Papers in Honor of George W. Snedecor (Ames, Iowa, 1972), p. 296.


Brian L. Joiner, “Living Histograms,”
International Statistical Review, 43 (1975),
339-340. But, for further developments,
see Mark F. Schilling, Ann E. Watkins,
and William Watkins, “Is Human Height
Bimodal?” The American Statistician, 56
(August 2002), 223-229.




A distinguished graphic that builds data measures out of data was
designed by Colonel Leonard P. Ayres for his statistical history of
World War I, a book with several notable graphics all done by
typewriter and rule. Constructing the data measures out of each
American division's name (a numerical designation) turns what
might have been a routine time-series into an elegant display. (Note
that the cumulative design depends on the fact that none of the
divisions returned before October 1918.) The triple-functioning
data measure shows: (1) the number of divisions in France for
each month, June 1917 to October 1918; (2) what particular divi-
sions were in France in each month; and (3) the duration of each
division's presence in France.

[Figure: typewriter graphic of American divisions in France; each monthly column, June 1917 to October 1918, stacks the numerical designations of the divisions present in that month.]

Leonard P. Ayres, The War with Germany (Washington, D.C., 1919), p. 102.
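
The logic of such a name-built cumulative chart can be sketched in Python. Given the month in which each unit arrived, every row repeats that unit's name from its arrival month onward, so the height of a column counts the units present, the names say which ones, and the length of a row gives the duration. The function name is hypothetical, and the arrival months below are invented for illustration; they are not Ayres's data.

```python
def name_stack_chart(arrival_month, months):
    """Return text rows of a cumulative chart built from names.

    arrival_month -- dict mapping a unit's name to the index (into
                     `months`) of the month in which it arrived
    months        -- list of month labels for the horizontal axis
    Assumes, as in the cumulative design described above, that no unit
    leaves once it has arrived.
    """
    cell = max(len(str(s)) for s in list(arrival_month) + list(months))
    # Latest arrivals on top, earliest at the bottom.
    order = sorted(arrival_month, key=arrival_month.get, reverse=True)
    rows = []
    for name in order:
        start = arrival_month[name]
        cells = [
            (str(name) if i >= start else "").rjust(cell)
            for i in range(len(months))
        ]
        rows.append(" ".join(cells))
    rows.append(" ".join(str(m).rjust(cell) for m in months))
    return rows


# Invented arrival months, for illustration only.
arrivals = {"1": 0, "2": 2, "26": 2, "42": 3}
for row in name_stack_chart(arrivals, ["Jun", "Jul", "Aug", "Sep"]):
    print(row)
```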




Encoding of data measures can be far more elaborate. The 
plotted points here are Chernoff faces, which reduce well, main-
taining legibility even with individual areas of .05 square inches
as shown. The analyst would observe the standard X-Y scatter- 
plot and then turn to the within-scatter detail, seeking clusters of 
similar observations over the X-Y plane. Outlying faces and those 
inconsistent with others in the neighborhood—they are, of course, 
strangers—should be identified by observation number or name. 




With cartoon faces and even numbers becoming data measures,
we would appear to have reached the limit of graphical economy 
of presentation, imagination, and, let it be admitted, eccentricity. 


3 Herman Chernoff, “The Use of Faces 
to Represent Points in k-Dimensional 
Space Graphically,” Journal of the Amer-
ican Statistical Association, 68 (June 1973),
361-368. For an application of faces lo- 
cated over two dimensions, see Howard 
Wainer and David Thissen, “Graphical 
Data Analysis,” Annual Review of Psy- 
chology, 32 (1981), 191-241. 


A stranger 




But let us consider this shaped poem, “Easter Wings” by George
Herbert (1593-1633), which uses space—the length of each line—
to depict quantity, all done 150 years before Playfair.⁴ The lines
double-function: the longer lines describe wealth, plenty, largesse,
and rising to flight; shorter lines tell of poverty and becoming
“most thinne”; and lines of intermediate length indicate transition
and change (decaying, rising, combining, becoming):


Easter-wings.


Lord, who createdst man in wealth and store,
Though foolishly he lost the same,
Decaying more and more,
Till he became
Most poore:
With thee
O let me rise
As larks, harmoniously,
And sing this day thy victories:
Then shall the fall further the flight in me.


My tender age in sorrow did beginne:
And still with sicknesses and shame
Thou didst so punish sinne,
That I became
Most thinne.
With thee
Let me combine
And feel this day thy victorie:
For, if I imp my wing on thine,
Affliction shall advance the flight in me.


4 For a remarkable OTSOG-like tour of the many typographical variant shapes of “Easter Wings” in its long publication history, see the essay “FIAT fLUX,” by “Random Cloud” in Randall McLeod, ed., Crisis in Editing: Texts of the English Renaissance (New York, 1994).


And the typographical delight of the statistician W. J. Youden: 


THE 
NORMAL 
LAW OF ERROR 
STANDS OUT IN THE 
EXPERIENCE OF MANKIND 
AS ONE OF THE BROADEST 
GENERALIZATIONS OF NATURAL 
PHILOSOPHY ♦ IT SERVES AS THE
GUIDING INSTRUMENT IN RESEARCHES
IN THE PHYSICAL AND SOCIAL SCIENCES AND
IN MEDICINE AGRICULTURE AND ENGINEERING ♦
IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE
INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT




Finally, this graphical pun: the visual data as the data measure,
as in the living histogram. The chart shows how states once dif-
fered in their engineering standards for painting lane stripes on
road pavement. Some states marked the road lanes with short
dashes and long gaps; others used only solid lines. Portrayed in
the graphic is the actual physical pattern painted on the road, with
48 U.S. states ordered by the length of the painted mark:

[Figure: the painted lane-stripe pattern for each state, drawn against a scale in feet. States in the order shown, top to bottom: California, Missouri, Minnesota, Alabama, Arizona, Colorado, Florida, Georgia, Kentucky, Louisiana, Maine, Massachusetts, Mississippi, Nebraska, Nevada, New Hampshire, New Mexico, New York, North Carolina, Oregon, Pennsylvania, Washington, Delaware, Iowa, Wyoming, Connecticut, Vermont, Wisconsin, Rhode Island, Kansas, West Virginia, Idaho, Michigan, Arkansas, North Dakota, Maryland, Montana, Virginia, South Carolina, New Jersey, Illinois, Indiana, Ohio, Oklahoma, South Dakota, Tennessee, Texas, Utah.]

Redrawn from A. R. Lauer, “Psychological Factors in Effective Traffic Control Devices,” Traffic Quarterly, 5 (January 1951), 94.




Data-Based Grids 


Very occasionally the grid can report directly on the data. This 
grid is formed by the location of measurement instruments; the 
plain dots register a zero reading, in contrast with the white back- 
ground where no readings were taken. Erasing the grid would 
erase measured data (rather uneventful, to be sure). Such is not 
the case for most grid dots, ticks, and lines. 


K. V. Roberts and D. E. Potter, “Mag- 
netohydrodynamic Calculations," in 
Berni Alder, et al., eds., Methods in 
Computational Physics: Volume 9, Plasma 
Physics (New York, 1970), p. 402. 


The arrangement of data in this table-graphic yields an internal 
grid, a rare example of data as grid: 


[Table-graphic: MID-PARENTS and ADULT CHILDREN, their Heights, and Deviations from 68¼ inches; the columns are headed by heights in inches.]

Karl Pearson, The Life, Letters and Labours of Francis Galton (Cambridge, 1930), vol. III-A, 14.




The United States in North America
(Mitchell Map)

Below is a modern map on the Mercator projection showing the configuration of the eastern portion of North America, with major drainage features. The labeled grid is an arbitrary one, however, designed to facilitate comparison between present-day knowledge of the geography of North America and the state of knowledge that existed in the mid-18th century when John Mitchell made the famous map (simplified and re-drawn here about 1/5 the width of the original) on which the original boundaries of the United States were marked in 1783. In order to show the deformation of earth surface that Mitchell incorporated into his map (from either ignorance or error), a grid has been constructed on Mitchell’s map that corresponds, square by square, with the rectangular grid on today’s map. Since each labeled square on the Mitchell map has a counterpart on the modern map, the relative stretching, compressing, and twisting of the earth surface on the Mitchell map can be perceived.

[Figure: two gridded maps of the United States in North America, the Mitchell map and the modern (1975) map. Legend: boundary line, Treaty of Paris, 1783; modern boundary between Canada and United States.]

Here the grid is the element of interest, rather than the map.

Lester J. Cappon, Barbara Bartz Petchenik, and John Hamilton Long, Atlas of Early American History (Princeton, 1976), p. 58.




The grid that follows presents the data on the surface of the rock;
on the sides, the grid is conventional. The two displays compare
the effect of religion, taking into account party affiliation, on a
person's vote for president in 1956 and in 1960 (when a Catholic
ran for president). Note there is no reliable slope associated with
religion in 1956, once party is controlled; in 1960, a systematic
effect is found. Reading the slopes in the other direction shows the
persistent effect of party in both elections:

[Figure: two displays. Vertical axis: Democratic percentage of the two-party vote, 0% to 100%; horizontal axis: Party Identification (Strong Democrat, Weak Democrat, Independent, Weak Republican, Strong Republican).]

Philip E. Converse, “Religion and Politics: The 1960 Election,” in Angus Campbell, Philip E. Converse, Warren E. Miller, and Donald E. Stokes, Elections and the Political Order (New York, 1966), 102-103.




Playfair tied the grid to the data in his skyrocketing debt graphic. 
Although the implicit plotting coordinates are based on regular 
intervals, the vertical grid lines in the published version are irreg- 
ularly spaced, keyed to significant events. The data-based grid is 
a shrewd graphical device, serving rather than fighting with the 
data. It is a technique underused in contemporary graphical work. 




Double-Functioning Labels 


Data-based coordinate lines lead to data-based labels, as, for example, 
at the bottom of Playfair’s debt graphic. Again, the issue is the 
same: why not use the ink to show data? Beginning with conven- 
tionally labeled frame 


0 10 20 30 40 


leaves those lonely ticks and numbers out on the tails, working to 
help the eye get a better reading on where the line of the range- 
frame ends. But that job can be done better by investing the same 
ink in data: rather than showing the minimum round number 
and the maximum round number at the ends of the frame, show 
the actual minimum and maximum realized in the data: 


With its greater precision and two tick-marks less of non-data- 
ink, the range-frame with range-labels is superior to the range- 
frame with round number labels. Both improve on the standard, 
passive frame. 
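
One way to generate range-labels is sketched below in Python. It keeps the round-number labels that fall strictly inside the data and replaces the two outermost round numbers with the actual minimum and maximum. The function name and the sample values are assumptions for illustration.

```python
import math


def range_frame_labels(values, step):
    """Return label positions for a range-frame with range-labels.

    The end labels are the observed minimum and maximum; the interior
    labels are the multiples of `step` lying strictly between them.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    lo, hi = min(values), max(values)
    if lo == hi:
        return [lo]
    k_first = math.ceil(lo / step)
    k_last = math.floor(hi / step)
    interior = [
        k * step for k in range(k_first, k_last + 1) if lo < k * step < hi
    ]
    return [lo] + interior + [hi]


# A conventional frame for these made-up values would be labeled 0 10 20 30 40.
print(range_frame_labels([3.2, 17.5, 38.6, 11.0], step=10))
# [3.2, 10, 20, 30, 38.6]
```

The same number of labels is printed as on the conventional frame, but the two end labels now report data.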


Numbers also double-function when used both to name things 
(like an identification number) and to reflect an ordering. In this 
graphic (in which the circled numbers fail to double-function), 
each number identifies a particular study of the thermal conduc- 
tivity of tungsten, ordered alphabetically by the last name of the 
first author. If that list were ordered by date of publication in-
stead, then the code would also indicate the time order in which
the various conductivity determinations were made. Thus “1”
would indicate the earliest study, and so on—or, alternatively,
“61c” would be the third study published in 1961. Such informa-
tion has interest, since we could see which of the early studies got
the right answer. In addition, the movement of the studies toward
the “correct” recommended values could be tracked. This extra
information requires no additional ink.
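
A small Python sketch of the second coding scheme, in which the identifying code also carries the publication order. It assumes each study is given as a (label, year, month) tuple; the month is a stand-in for whatever finer publication date is available, and the labels below are invented.

```python
from collections import defaultdict
from string import ascii_lowercase


def year_letter_codes(studies):
    """Map each study label to a code such as '61c'.

    The two digits are the year of publication and the letter is the
    study's position within that year, so '61c' is the third study
    published in 1961. Supports up to 26 studies per year.
    """
    seen_in_year = defaultdict(int)
    codes = {}
    for label, year, month in sorted(studies, key=lambda s: (s[1], s[2])):
        letter = ascii_lowercase[seen_in_year[year]]
        seen_in_year[year] += 1
        codes[label] = f"{year % 100:02d}{letter}"
    return codes


# Invented studies, listed alphabetically by first author.
studies = [
    ("Abel", 1961, 2),
    ("Kerr", 1961, 4),
    ("Moss", 1958, 5),
    ("Zed", 1961, 9),
]
print(year_letter_codes(studies))
# {'Moss': '58a', 'Abel': '61a', 'Kerr': '61b', 'Zed': '61c'}
```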


C. Y. Ho, R. W. Powell, and P. E. Liley,
Thermal Conductivity of the Elements: A
Comprehensive Review, supplement no.
1, Journal of Physical and Chemical Ref-
erence Data, 3 (1974), 1-692.


[Figure: THERMAL CONDUCTIVITY OF TUNGSTEN. Vertical axis: THERMAL CONDUCTIVITY, W cm^-1 K^-1; horizontal axis: TEMPERATURE, K. The melting point (M.P. 3660 K) is marked, and the values for the liquid are labeled PROVISIONAL.]


In most graphics, the coordinate labels are far from the data 
measures. Consequently the eye of the viewer must move back 
and forth between the path formed by the data and the coordinate 
positions arrayed along the margins of the graphic. Sometimes this 
eye-work can be eliminated entirely by turning the coordinate 
labels into data measures, another double-functioning maneuver. 
Take the example from the style sheet of the Journal of the Amer- 
ican Statistical Association: 




The grid increments of the X-axis are relocated upward to mark
the path of the data:

[Figure: the JASA example with the integer values 0 through 15 placed along the path of the data; Y-scale marked at .00, .05, .10, and .15.]


And since the issue in this display is the probability at each integer 
value, the round-number Y-scale is replaced by exact values: 


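[Figure: the same display with exact probabilities in place of the round-number Y-scale: .177 at 0, .114 at 1, .075 at 2, .052 at 3, .034 at 4, .025 at 5, .004 at 8 and 9, .002 at 10 through 15.]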


The Y-scale now resembles the dashes of the dot-dash-plot, with 
the vertical column of data-positioned numbers serving as the 
dashes to indicate the marginal distribution. 




The method of data-based markers for the marginal distribu- 
tions suggests a further enhancement of the dot-dash-plot: 


[Figure: dot-dash-plot with data-based numbers in both margins; the vertical margin reads 20.3, 15.2, 14.6, 11.3, 10.1, 8.4, 5.1.]


Now the numbers in the margin eliminate the standard frame
and even a range-frame, replace the coordinate ticks, show the
marginal distribution of both variables, and record the exact values
of the two measurements made on each unit of observation. This
graphical arrangement performs better for smaller data sets (say 30
observations or less) and when a fine level of detail is required.
Finally, a striking design with data-based labels:


Designed by Carol Moore, Corporate 
Annual Reports, Inc., in Walter Herdeg, 
Graphis/ Diagrams (Zurich, 1976), p. 23. 




Puzzles and Hierarchy in Graphics 


The complexity of multifunctioning elements can sometimes turn 
data graphics into visual puzzles, crypto-graphical mysteries for 
the viewer to decode. A sure sign of a puzzle is that the graphic 
must be interpreted through a verbal rather than a visual process. 
For example, despite its clever and multifunctioning data measure, 
formed by crossing two four-color grids, this is a puzzle graphic. 
Deployed here, in a feat of technological virtuosity, are 16 shades 
of color spread on 3,056 counties, a monument to a sophisticated 
computer graphics system.5 But it is surely a graphic experienced
verbally, not visually. Over and over, the viewers must run
little phrases through their minds, trying to maintain the right
pattern of words to make sense out of the visual montage: “Now
let's see, purple represents counties where there are both high levels
of male cardiovascular disease mortality and 11.6 to 56.0 percent
of the households have more than 1.01 persons per room.
. . . What does that mean anyway? . . . And the yellow-green
counties. . . .” By contrast, in a non-puzzle graphic, the transla-
tion of visual to verbal is quickly learned, automatic, and implicit,
so that the visual image flows right through the verbal decoder
initially necessary to understand the graphic. As Paul Valéry wrote,
“Seeing is forgetting the name of the thing one sees.”


5 The technique is described in Vincent
P. Barabba and Alva L. Finkner, “The 
Utilization of Primary Printing Colors 
in Displaying More than One Variable," 
in Bureau of the Census, Technical Paper 
No. 43, Graphical Presentation of Statistical 
Information (Washington, D.C., 1978), 
14-21. The maps are assessed in Howard 
Wainer and C. M. Francolini, "An 
Empirical Inquiry Concerning Human 
Understanding of Two-Variable Color 
Maps," American Statistician, 34 (1980), 
81-93. 


[Figure: two-variable color map of U.S. counties, with its legend of class intervals.]




Color often generates graphical puzzles. Despite our experiences 
with the spectrum in science textbooks and rainbows, the mind's 
eye does not readily give a visual ordering to colors, except pos- 
sibly for red to reflect higher levels than other colors, as in the 
hot spots of the cancer map. Attempts to give colors an order 
result in those verbal decoders and the mumbling of little mental 
phrases again —indeed, even mnemonic phrases about the phrases 
required for graphical decoding: 


A method of coloring ingenious in idea but not very satisfactory
in practice was used by L. L. Vauthier. It was called the
mountain-to-the-sea method. White was used for the repre-
sentation of the greatest intensity of the fact because it indicated
the summit of a mountain with its eternal snow, next came
green representing the forests farther down the slopes, then
yellow for the grain of the plains, and finally for the minimum
the blue of the waters at sea level.6

Because they do have a natural visual hierarchy, varying shades of
gray show varying quantities better than color. Ten gray shades
worked effectively in the galaxies map:

6 H. Gray Funkhouser, “Historical Development of the Graphical
Representation of Statistical Data,” Osiris, 3 (1937), 326, who cites
E. Cheysson, “Les méthodes de statistique graphique à l'Exposition
universelle de 1878,” Journal de la Société de Statistique de Paris
(1878), 331.


The success of gray compared to the visually more spectacular 
color gives us a lead on how multifunctioning graphical elements 
can communicate complex information without turning into puz- 
zles. The shades of gray provide an easily comprehended order to 
the data measures. This is the key. Central to maintaining clarity 
in the face of the complex are graphical methods that organize and 
order the flow of graphical information presented to the eye. 

How can graphical architecture promote the ordered, sequenced, 
hierarchical flow of information from the graphic to the mind’s 
eye? How can the data-information be arranged so that the viewer 
is able to peel back layer after layer of data from a graphic? 

Multiple layers of information are created by multiple viewing 
depths and multiple viewing angles. 




Graphics can be designed to have at least three viewing depths: 
(1) what is seen from a distance, an overall structure usually aggre- 
gated from an underlying microstructure; (2) what is seen up close 
and in detail, the fine structure of the data; and (3) what is seen 
implicitly, underlying the graphic—that which is behind the graphic. 
Look at all the different levels of detail created by this population 
density map of the United States, a glory of modern cartography 
prepared by the Bureau of the Census. Each dot, except in urban 
centers, represents 500 people. Note the corridors connecting the
major urban complexes; the effects of landforms on the population 
distribution (the central valley of California, the valleys and ridges 
of Appalachia, and the clusters along rivers); and the small towns 
along the highways, linked like a string of pearls. The map arrays, 
in effect, some 400,000 points on its implicit grid. 

Different visual angles for different aspects of the data also or- 
ganize graphical information. Each separate line of sight should 
remain unchanging (preferably horizontal or vertical) as the eye 
watches for data variation off the flat of the line of sight. For mul-
tivariate work, several clear lines can be created. Recall Ayres’
display of American divisions in France. Even with its complex, 
interwoven data, the graphic is not a puzzle. Three separate visual 
angles make the flow of information coherent: the profile of the 
horizon for the upward-moving time-series, the vertical for the 
composition of the bar, and the horizontal for each division’s stay. 
Thus while every drop of ink serves three different data display 
functions, each of the three comes to the eye with its own inde- 
pendence and integrity. 


[Figure: Ayres’ display of American divisions in France, with a monthly time axis.]




Current Receipts of Government as a
Percentage of Gross Domestic
Product, 1970 and 1979

[Table-graphic: countries ranked in each year, with a line joining each country's 1970 and 1979 positions.]

1970                          1979
Sweden                        57.4  Sweden
Netherlands                   55.8  Netherlands
Norway                        52.2  Norway
Britain                       43.4  France
France                        43.2  Belgium
Germany                       42.9  Germany
Belgium                       39.0  Britain
Canada                        38.2  Finland
Finland                       35.8  Canada
Italy                         35.7  Italy
United States                 33.2  Switzerland
Greece          26.8          32.5  United States
Switzerland     26.5          30.6  Greece
Spain           22.5          27.4  Spain
Japan           20.7          26.6  Japan




Similarly, this table-graphic organizes data for viewing in several 
directions. The chart, when read vertically, ranks 15 countries by 
government tax collections in 1970 and again in 1979, with the 
names spaced in proportion to the percentages. Across the columns, 
the paired comparisons show how the numbers changed over the 
years. The slopes are also compared by reading down the collec- 
tion of lines, and lines of unusual slope stand out from the overall 
upward pattern. The information shown is both integrated and 
separated: integrated through its connected content, separated in 
that the eye follows several different and uncluttered paths in look- 
ing over the data: 




Such an analysis of the viewing architecture of a graphic will help 
in creating and evaluating designs that organize complex 
information hierarchically. 


I want to reach that state of condensation of sensations 
which constitutes a picture. 


Henri Matisse 


8 Data Density and Small Multiples 


Our eyes can make a remarkable number of distinctions within a 
small area. With the use of very light grid lines, it is easy to 
locate 625 points in one square inch or, equivalently, 100 points 
in one square centimeter. 


Or consider how an 80 by 80 grid over a square inch (about 30
by 30 over a square centimeter) divides the space:1


With the help of considerable redundancy and context, our eyes 
make fine distinctions of this sort all the time. Measurement instru- 
ments used in engineering, architectural, and machine work are 
engraved with scales of 20 increments to the centimeter and 50 
to the inch. Or consider the reading of fine print. The type in the 
U.S. Statistical Abstract is set at 12 lines per vertical inch, with each 
line running at about 23 characters per inch for a maximum den- 
sity of 276 characters per square inch. The actual density, given 
the white space, is in this case 185 characters per square inch or 
28 per square centimeter. 
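
The arithmetic behind these type densities can be checked directly. The sketch below uses only the figures quoted in the paragraph; the variable names are illustrative.

```python
CM2_PER_IN2 = 2.54 ** 2  # 6.4516 square centimeters in one square inch

lines_per_inch = 12
chars_per_inch = 23

# Maximum density if every position on every line were filled.
max_density = lines_per_inch * chars_per_inch  # characters per square inch

# Actual density quoted in the text, converted to square centimeters.
actual_density_in2 = 185
actual_density_cm2 = actual_density_in2 / CM2_PER_IN2

print(max_density)                   # 276
print(round(actual_density_cm2, 1))  # a little under 29
```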


[Figure: sample of fine print from the Statistical Abstract, Table No. 1450, “Steel Products: Net Shipments, by Market Classes: 1960 to 1978” (in thousands of short tons; comprises carbon, alloy, and stainless steel), with columns for 1960, 1965, 1970, and 1973 through 1978.]


25,281 distinctions 


1 A square grid formed on each side by
n parallel black and n-1 parallel white
lines contains n² intersections of two
black lines (corners of squares), (n-1)²
intersections of two white lines (white
squares), and 2n(n-1) intersections of a
black and white line (sides of squares),
for a total of (2n-1)² line intersections
or distinct locations.
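
The count in this note can be verified numerically. The sketch below assumes n black lines and n-1 white lines in each direction, as the note describes; the function name `grid_locations` is hypothetical.

```python
def grid_locations(n):
    """Count the distinct locations in a grid of n black and
    n-1 white lines running in each of two directions."""
    black_black = n * n              # corners of squares
    white_white = (n - 1) ** 2       # white squares
    black_white = 2 * n * (n - 1)    # sides of squares
    total = black_black + white_white + black_white
    # The three counts sum to a perfect square, (2n-1)^2.
    assert total == (2 * n - 1) ** 2
    return total


print(grid_locations(80))  # 25281
```

For the 80 by 80 grid this gives the 25,281 distinctions cited for one square inch.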


U.S. Bureau of the Census, Statistical 
Abstract of the United States: 1979 (Wash- 
ington, D.C., 1979), p. 822. 




Maps routinely present even finer detail. A cartographer writes
that “the resolving power of the eye enables it to differentiate to
0.1 mm where provoked to do so. Clearly, therefore, conciseness
is of the essence and high resolution graphics are a common
denominator of cartography.”2 Distinctions at 0.1 mm mean
254 per inch.

2 D. P. Bickmore, “The Relevance of Cartography,” in John C. Davis and
Michael J. McCullagh, eds., Display and Analysis of Spatial Data
(London, 1975), p. 331.


How many statistical graphics take advantage of the ability of the
eye to detect large amounts of information in small spaces? And
how much information should graphics show? Let us begin by
considering an empirical measure of graphical performance, the
data density.


Data Density in Graphical Practice 


The numbers that go into a graphic can be organized into a data 
matrix of observations by variables. Taking into account the size 
of the graphic in relation to the amount of data displayed yields 
the data density: 

                                number of entries in data matrix
    data density of a graphic = --------------------------------
                                      area of data graphic


Data matrices and data densities vary enormously in practice. 

At one extreme, this overwrought display (originally printed in 
five colors) presents a data matrix of four entries, the names and 
the numbers for the two bars on the right. The left bar is merely 
the total of the other two. The graph covers 26.5 square inches 
(171 square centimeters), resulting in a data density of .15 num- 
bers per square inch (.02 numbers per square centimeter), which 
is thin indeed. 
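
As a worked instance of the definition, the figures for this display can be run through a small function. The name `data_density` is illustrative; the inputs are the four entries and the two areas given in the paragraph above.

```python
def data_density(entries, area):
    """Entries in the data matrix divided by the area of the graphic."""
    if area <= 0:
        raise ValueError("area must be positive")
    return entries / area


per_square_inch = data_density(4, 26.5)
per_square_cm = data_density(4, 171)

print(round(per_square_inch, 2))  # 0.15
print(round(per_square_cm, 2))    # 0.02
```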




[Figure: bar chart, in percent, with three bars labeled “Total participation,” “In college or university,” and “In adult education.”]


Executive Office of the President, Office 
of Management and Budget, Social 
Indicators, 1973 (Washington, D.C., 
1973), p. 86. 




The exemplar from the JASA style sheet comes in at a light-
weight 3.8 numbers per square inch (0.6 numbers per square cen-
timeter) and a small data matrix of 32 entries:


[Figure: the JASA style-sheet example. Y-axis: average probability, marked at 0.05, 0.10, and 0.15.]


In contrast, the New York weather history, in this reduced 
version, does very well at 181 numbers per square inch (28 per 
square centimeter): 


[Figure: “New York City's Weather for 1980,” shown at reduced size.]




An annual sunshine record reports about 1,000 numbers per
square inch (160 per square centimeter):

F. J. Monkhouse and H. R. Wilkinson, Maps and Diagrams (London, third
edition, 1971), pp. 242-243.


The visual metaphor corresponds appropriately to the data if the 
image is reversed, so that the light areas are the times when the 
sun shines: 


[Figure: the sunshine record with the image reversed, running from January through December.]




This map (27 square inches, 175 square centimeters) shows the 
location and boundaries of 30,000 communes of France. It would 
require at least 240,000 numbers to recreate the data of the map 
(30,000 latitudes, 30,000 longitudes, and perhaps six numbers 
describing the shape of each commune). Thus that data density 
is nearly 9,000 numbers per square inch, or 1,400 numbers per
square centimeter. 

The new map of the galaxies locates 2,275,328 encoded rectangles 
on a two-dimensional surface of 61 square inches (390 square
centimeters). Each rectangle represents three numbers (two by its 
location, one by its shading), yielding a data density of 110,000 
numbers per square inch or 17,000 numbers per square centimeter. 
That is the current record. 
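
The two map densities follow from the same division of numbers by area. The sketch below reruns that arithmetic with the counts stated in the text; the allowance of six shape numbers per commune is the text's own estimate, and the variable names are illustrative.

```python
def density(numbers, area):
    """Numbers displayed per unit of area."""
    return numbers / area


# Communes of France: latitude, longitude, and about six shape numbers each.
communes = 30_000
france_numbers = communes * (2 + 6)        # 240,000 numbers
france_in2 = density(france_numbers, 27)   # per square inch
france_cm2 = density(france_numbers, 175)  # per square centimeter

# Map of the galaxies: three numbers per encoded rectangle.
galaxy_numbers = 2_275_328 * 3
galaxy_in2 = density(galaxy_numbers, 61)
galaxy_cm2 = density(galaxy_numbers, 390)

print(round(france_in2), round(france_cm2))
print(round(galaxy_in2), round(galaxy_cm2))
```

The results round to the figures quoted: nearly 9,000 and about 1,400 for the communes, and roughly 110,000 and 17,000 for the galaxies.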


Jacques Bertin, Semiologie Graphique 
(Paris, second edition, 1973), p. 152. 




Data Density and the Size of the Data Matrix: 
Publication Practices 


The table shows the data density and the size of the data matrix 
for graphics sampled from scientific and news publications. At least 
20 graphics from each publication were examined. 

The table records an enormous diversity of graphical performances 
both within and between publications. A few data-rich designs 
appear in nearly every publication. The opportunity is there but 
it is rarely exploited: the average published graphic is rather thin, 


Data Density and Size of Data Matrix,
Statistical Graphics in Selected Publications, Circa 1979-1980

                                                   Data Density (numbers        Size of Data Matrix
                                                   per square inch)
Publication                                        median  minimum  maximum   median  minimum  maximum

Nature                                                 48        3      362      177       15     3780
Journal of the Royal Statistical Society, B            27        4      115      200       10     1460
Science                                                21        5       44      109       26      316
Wall Street Journal                                    19        3      154      135       28      788
Fortune                                                18        5       31       96       42      156
The Times (London)                                     18        2      122       50       14      440
Journal of the American Statistical Association        17        4      167      150       46     1600
Asahi                                                  13        2      113       29       15      472
New England Journal of Medicine                        12        3      923       84        8     3600
The Economist                                           9        1       51       36        3      192
Le Monde                                                8        1       17       66       11      312
Psychological Bulletin                                  8        1       74       46        8      420
Journal of the American Medical Association             7        1       39       53       14      735
New York Times                                          7        1       13       35        6      580
Business Week                                           6        2       12       32       14       96
Newsweek                                                6        1       13       23        2       96
Annuaire Statistique de la France                       6        1       25       96       12      540
Scientific American                                     5        1       69       46       14      652
Statistical Abstract of the United States               5        2       23       38        8      164
American Political Science Review                       2        1       10       16        9       40
Pravda                                                0.2      0.1        1        5        4       20




based on about 50 numbers shown at the rate of 10 per square
inch. Among the world’s newspapers, the Wall Street Journal, The
Times (London), and Asahi publish data-rich graphics, with data
densities equal to those of the Journal of the American Statistical
Association. Most of the American papers and magazines, along 
with Pravda, publish less data per graphic than the major papers 
of other industrialized countries. 

Very few statistical graphics achieve the information display 
rates found in maps. Highly detailed maps portray 100,000 to 
150,000 bits per square inch. For example, the average U.S. 
Geological Survey topographic quadrangle (measuring 17 by 23 
inches) is estimated to contain over 100 million bits of informa- 
tion, or about 250,000 per square inch (40,000 per square 
centimeter). Perhaps some day statistical graphics will perform 
as successfully as maps in carrying information. 


High-Information Graphics 


Data graphics should often be based on large rather than small 
data matrices and have a high rather than low data density. More 
information is better than less information, especially when the 
marginal costs of handling and interpreting additional information 
are low, as they are for most graphics. The simple things belong 
in tables or in the text; graphics can give a sense of large and 
complex data sets that cannot be managed in any other way. If 
the graphic becomes overcrowded (although several thousand 
numbers represented may be just fine), a variety of data-reduction 
techniques (averaging, clustering, smoothing) can thin the num-
bers out before plotting.4 Summary graphics can emerge from
high-information displays, but there is nowhere to go if we begin 
with a low-information design. 
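
One of the simplest data-reduction steps, averaging over a moving window, can be sketched as follows. This is an illustration under stated assumptions, not the locally weighted regression cited in the note; the function `moving_average` and the sample series are hypothetical.

```python
def moving_average(values, window=3):
    """Smooth a series by averaging each run of `window` adjacent values.
    The result has len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical noisy series.
series = [4, 9, 2, 7, 6, 11, 5, 8]
print(moving_average(series, window=4))
```

A wider window thins the detail further, so the high-information display remains the starting point and the summary is derived from it.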

Data-rich designs give a context and credibility to statistical
evidence. Low-information designs are suspect: what is left out, 
what is hidden, why are we shown so little? High-density graphics 
help us to compare parts of the data by displaying much information 
within the view of the eye: we look at one page at a time and the 
more on the page, the more effective and comparative our eye 
can be.5 The principle, then, is: 


Maximize data density and the size of the data 
matrix, within reason. 


High-information graphics must be designed with special care. As 
the volume of data increases, data measures must shrink (smaller 
dots for scatters, thinner lines for busy time-series). The clutter of 


3 Morris M. Thompson, Maps for America 
(Washington, D.C., 1979), p. 187. 


4 Paul A. Tukey and John W. Tukey, 
“Summarization: Smoothing; Supple- 
mented Views,” in Vic Barnett, ed., 
Interpreting Multivariate Data (Chichester, 
England, 1982), ch. 12; and William S. 
Cleveland, "Robust Locally Weighted 
Regression and Smoothing Scatterplots,” 
Journal of the American Statistical Associa- 
tion, 74 (1979), 829-836. 


5 It is suggested in the analysis of x-ray
films to “search a reduced image so that 
the whole display can be perceived on 
at least one occasion without large eye 
movement." Edward Llewellyn 
Thomas, “Advice to the Searcher or 
What Do We Tell Them?” in Richard 
A. Monty and John W. Senders, eds., 
Eye Movements and Psychological Processes 
(Hillsdale, N.J., 1976), p. 349. 




chartjunk, non-data-ink, and redundant data-ink is even more 
costly than usual in data-rich designs. 

The way to increase data density other than by enlarging the data 
matrix is to reduce the area of a graphic. The Shrink Principle has 
wide application: 


Graphics can be shrunk way down. 


Many data graphics can be reduced in area to half their currently
published size with virtually no loss in legibility and information.
For example, Bertin’s crisp and elegant line allows the display of
17 small-scale graphics on a single page along with extensive text.
Repeated application of the Shrink Principle leads to a powerful
and effective graphical design, the small multiple.

Jacques Bertin, Semiologie Graphique
(Paris, second edition, 1973), p. 214.


[Figure: a page from Bertin, headed “Problèmes graphiques posés par les chroniques” (graphical problems posed by time-series), combining 17 small-scale graphics with extensive text.]




Small Multiples 


Small multiples resemble the frames of a movie: a series of graphics,
showing the same combination of variables, indexed by changes
in another variable. Twenty-three hours of Los Angeles air pol-
lution are organized into this display, based on a computer-
generated video tape. Shown is the hourly average distribution of
reactive hydrocarbon emissions. The design remains constant
through all the frames, so that attention is devoted entirely to
shifts in the data:


From video tape by Gregory J. McRae, 
California Institute of Technology. 
The model is described in G. J. McRae, 
W. R. Goodin, and J. H. Seinfeld, 
"Development of a Second-Generation 
Mathematical Model for Urban Air 
Pollution. I. Model Formulation," 
Atmospheric Environment, 16 (1982), 
679-696. 




These grim small multiples show the distribution of occurrence
of the cancer melanoma. The sites of 269 primary melanomas are
recorded, along with the distribution between men and women.
Note the data graphical arithmetic, similar to that of the
multiwindow plot.

Arthur Wiskemann, “Zur Melanomentstehung durch chronische
Lichteinwirkung,” Der Hautarzt, 25 (1974), 21.


[Figure captions: Fig. 1. Distribution of 269 primary melanomas on the head and neck. Figs. 2 and 3. Breakdown of the melanoma distribution by sex.]




The effects of sampling errors are shown in these 12 distributions,
each based on a sample of 50 random normal deviates:

[Figure: 12 small frequency distributions, one for each sample.]

Edmond A. Murphy, “One Cause? Many Causes? The Argument from the
Bimodal Distribution,” Journal of Chronic Diseases, 17 (1964), 309.


These six distributions show the age composition of herring catches
each year from 1908 to 1913. A tremendous number of herring
were spawned in 1904, and that class began to dominate the 1908
catch as four-year-olds, then the 1909 catch as five-year-olds, and
so on:

[Figure: six small distributions of herring catch by age, one for each year from 1908 to 1913.]

Johan Hjort, “Fluctuations in the Great Fisheries of Northern Europe,”
Rapports et Procès-Verbaux, 20 (1914), in Susan Schlee, The Edge of an
Unfamiliar World (New York, 1973), p. 226.


This next design compares a complex set of data; shown are the
chromosomes of (from left to right) man, chimpanzee, gorilla,
and orangutan. The similarities between humans and the great
apes are to be noted.

[Figure: the chromosomes of man, chimpanzee, gorilla, and orangutan, set side by side for comparison.]

Jorge J. Yunis and Om Prakash, “The Origin of Man: A Chromosomal
Pictorial Legacy,” Science, 215 (March 19, 1982), 1527.




And, finally, a visually similar small multiple, the Consumer
Reports frequency-of-repair records for automobiles built from
1976 to 1981. This is a particularly ingenious mix of table and
graphic, portraying a complex set of comparisons between man-
ufacturers, types of cars, year, and trouble spots.


[Figure: frequency-of-repair records for model years 1976 through 1981. Five symbols mark each entry as much better than average, better than average, average, worse than average, or much worse than average.
Cars shown: Chevrolet Malibu, Chevelle 6, V6; Chevrolet Monza 4; Datsun 210, B210; Ford Granada 6; Ford pickup truck 6 (2WD); Honda Accord; Mercedes-Benz 300D (diesel); Plymouth Volare 6; Subaru (except 4WD); Toyota Corolla (except Tercel); Volkswagen Rabbit (diesel); Volvo 240 series.
Trouble spots (rows): air-conditioning; body exterior (paint); body exterior (rust); body hardware; body integrity; brakes; clutch; driveline; electrical system (chassis); engine cooling; engine mechanical; exhaust system; fuel system; ignition system; suspension; transmission (manual); transmission (automatic); trouble index; cost index.]


Consumer Reports, 47 (April 1982), 
199-207. Redrawn. 




Conclusion 


Well-designed small multiples are 
* inevitably comparative 
* deftly multivariate 
* shrunken, high-density graphics 
* usually based on a large data matrix 
* drawn almost entirely with data-ink 
* efficient in interpretation 


* often narrative in content, showing shifts in the relationship 
between variables as the index variable changes (thereby 
revealing interaction or multiplicative effects). 


Small multiples reflect much of the theory of data graphics:


For non-data-ink, less is more.

For data-ink, less is a bore.


The two aphorisms on the meaning of “less” are, respectively,
credited to Ludwig Mies van der Rohe and to Robert Venturi,
Complexity and Contradiction in Architecture (New York, second
edition, 1977), p. 17.


[Figure: two Cartes Figuratives by Charles Joseph Minard, one showing Hannibal's campaign and one showing the losses of the French army in the Russian campaign of 1812-1813.]


9 Aesthetics and Technique in Data Graphical Design 


Along with the amazing graphic of the French losses in the Russian
invasion, Minard includes a second “Carte Figurative.” It portrays 
Hannibal’s fading elephant campaign in Spain, Gaul, and Northern 
Italy. Minard uses a light transparent color for flow-lines, allowing 
the underlying type to show through. This refined use of color to 
depict more information contrasts with the garish tones too often 
seen in modern graphics. 

What makes for such graphical elegance? What accounts for 
the quality of Minard's graphics, of those of Playfair and Marey, 
and of some recent work, such as the new view of the galaxies? 
Good design has two key elements: 


Graphical elegance is often found in simplicity 
of design and complexity of data. 


Visually attractive graphics also gather power from content and 
interpretations beyond the immediate display of some numbers. 
The best graphics are about the useful and important, about life 
and death, about the universe. Beautiful graphics do not traffic 
with the trivial. 

On rare occasions graphical architecture combines with the data 
content to yield a uniquely spectacular graphic. Such performances 
can be described and admired but there are no easy compositional 
principles on how to create that one wonderful graphic in millions. 
As Barnett Newman once said, “Aesthetics is for the artist like
ornithology is for the birds.”

What can be suggested, though, are some guides for enhancing 
the visual quality of routine, workaday designs. Attractive displays 
of statistical information 


* have a properly chosen format and design
* use words, numbers, and drawing together
* reflect a balance, a proportion, a sense of relevant scale
* display an accessible complexity of detail
* often have a narrative quality, a story to tell about the data
* are drawn in a professional manner, with the technical details
of production done with care
* avoid content-free decoration, including chartjunk.


Charles Joseph Minard, Tableaux Gra- 
phiques et Cartes Figuratives de M. Minard, 
1845-1869, a portfolio of his work held 
by the Bibliothèque de l'École Nationale
des Ponts et Chaussées, Paris. 




The Choice of Design: Sentences, Text-Tables, Tables, 
Semi-Graphics, and Graphics 


The substantive content, extensiveness of labels, and volume and 
ordering of data all help determine the choice of method for the 
display of quantitative materials. The basic structures for showing 
data are the sentence, the table, and the graphic. Often two or 
three of these devices should be combined. 

The conventional sentence is a poor way to show more than 
two numbers because it prevents comparisons within the data. 
The linearly organized flow of words, folded over at arbitrary 
points (decided not by content but by the happenstance of column 
width), offers less than one effective dimension for organizing the 
data. Instead of: 


Nearly 53 percent of the type A 
group did something or other 
compared to 46 percent of B and 
slightly more than 57 percent of C.


Arrange the type to facilitate comparisons, as in this text-table:


The three groups differed in how 
they did something or other: 


Group A 53%
Group B 46%
Group C 57%


There are nearly always better sequences than alphabetical—for 
example, ordering by content or by data values: 


Group B 46% 
Group A 53%
Group C 57%
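
The reordering above can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the helper name `text_table` and its two-space gutter are assumptions chosen for the example, and the figures are the three percentages quoted above.

```python
# Hypothetical helper (not from the book): lay out labeled percentages as an
# aligned text-table, ordered by data value rather than alphabetically.
def text_table(values, title=""):
    # Sort rows by the data value, smallest first.
    rows = sorted(values.items(), key=lambda item: item[1])
    # Pad every label to the width of the longest one so the numbers align.
    width = max(len(label) for label, _ in rows)
    lines = [title] if title else []
    for label, pct in rows:
        lines.append(f"{label.ljust(width)}  {pct:>3d}%")
    return "\n".join(lines)


groups = {"Group A": 53, "Group B": 46, "Group C": 57}
table = text_table(
    groups, "The three groups differed in how they did something or other:"
)
print(table)
```

Sorting on the value rather than the label is the whole point of the sketch; ordering by content would need a sort key supplied by the analyst.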


Tables are clearly the best way to show exact numerical values, 
although the entries can also be arranged in semi-graphical form. 
Tables are preferable to graphics for many small data sets.1 A 
table is nearly always better than a dumb pie chart; the only 
worse design than a pie chart is several of them, for then the 
viewer is asked to compare quantities located in spatial disarray 
both within and between pies, as in this heavily encoded example 
from an atlas. Given their low data-density and failure to order 
numbers along a visual dimension, pie charts should never be used.2




Department of Surveys, Ministry of 
Labour, Atlas of Israel (Jerusalem, 
1956-), vol. 8, p. 8. 


1 On the design of tables, see A.S.C.
Ehrenberg, "Rudiments of Numeracy," 
Journal of the Royal Statistical Society, 
A, 140 (1977), 277-297. 


2 This point is made decisively in Jacques
Bertin, Graphics and Graphic Information 
Processing (Berlin, 1981). Bertin describes 
multiple pie charts as "completely 
useless" (p. 111). 


Tables also work well when the data presentation requires many
localized comparisons. In this 410-number table that I designed
for the New York Times to show how different people voted in
presidential elections in the United States, comparisons between
the elections of 1980 and 1976 are read across each line;
within-election analysis is conducted by reading downward in the
clusters of three to seven lines. The horizontal rules divide the
data into topical paragraphs; the rows are ordered so as to tell an
ordered story about the elections. This type of elaborate table, a
supertable, is likely to attract and intrigue readers through its
organized, sequential detail and reference-like quality. One
supertable is far better than a hundred little bar charts.


How Different Groups Voted for President

Based on 12,782 interviews with voters at their polling places.
Shown is how each group divided its vote for President and, in
parentheses, the percentage of the electorate belonging to each
group.

                                        CARTER  REAGAN  ANDERSON  CARTER-FORD
                                                                      in 1976
[row illegible]
Independents (23%)                          30      54        12        43-54
Republicans (28%)                           11      84         4         9-90

[row illegible]
Conservatives (28%)                         23      71         4        29-70

[rows illegible]
Politically active Democrats (3%)           72      19         8  [illegible]
[illegible] in primaries (13%)              66      24         8  [illegible]

[row illegible]
Moderate Independents (12%)                 31      53        13        45-53
[row illegible]

Liberal Republicans (2%)                    25      66         9        17-82
[row illegible]
Conservative Republicans (12%)               6      91         2         6-93
[row illegible]

East (32%)                                  43      47         8        51-47
[row illegible]
Midwest (20%)                               41      51         6        48-50
[row illegible]

Blacks (10%)                                82      14         3        82-16
[row illegible]
Whites (88%)                                36      55         8        47-52

[several rows illegible]

Jewish (5%)                                 45      39        14        64-34
Protestant (46%)                            37      56         6        44-55
Born-again white Protestant (17%)           34      61         4            —

18-21 years old (6%)                        44      43        11        48-50
22-29 years old (17%)                       43      43        11        51-46
30-44 years old (31%)                       37      54         2        49-49
45-59 years old (23%)                       39      55         6        47-52
60 years or older (18%)                     40      54         4        47-52

Family income
Less than $10,000 (13%)                     50      41         6        58-40
$10,000 - $14,999 (14%)                     47       4         8        55-43
$15,000 - $24,999 (30%)                     38      53         7        48-50
$25,000 - $50,000 (24%)                     32      58         8        36-62
Over $50,000 (5%)                           25      65         8            —

Professional or manager (40%)               33      56         9        41-57
Clerical, sales or other
  white-collar (11%)                        42      48         8        46-53
Blue-collar worker (17%)                    46      47         5        57-41
Agriculture (3%)                            29      66         3            —
Looking for work (3%)                       55      35         7        65-34

Education
High school or less (39%)                   46      48         4        57-43
Some college (28%)                          35      55         8        51-49
College graduate (27%)                      35      51        11        45-55

Labor union household (26%)                 47      44         7        59-39
No member of household in union (62%)       35      55         8        43-55

Family finances
Better off than a year ago (16%)            53      37         8        30-70
Same (40%)                                  46      46         7        51-49
Worse off than a year ago (34%)             25      64         8        77-23

Family finances and political party
Democrats, better off
  than a year ago (7%)                      77      16         6        69-31
Democrats, worse off
  than a year ago (13%)                     47      39        10         94-6
Independents, better off (2%)               45      36        12            —
Independents, worse off (9%)                21      65        11  [illegible]
Republicans, better off (4%)                18      77         5         3-97
Republicans, worse off (11%)                 6      89         4        24-76

More important problem
Unemployment (39%)                          51      40         7        75-25
Inflation (44%)                             30      60         9        35-65

Feel that U.S. should be more forceful in
  dealing with Soviet Union even if it would
  increase the risk of war (54%)            28      64         6            —
Disagree (31%)                              56      32        10            —

Favor equal rights amendment (46%)          49      38 [illegible]          —
Oppose equal rights amendment (35%)         26      68         4            —

When decided about choice
Knew all along (41%)                        47      50         2        44-55
During the primaries (13%)                  30      60         8        57-42
During conventions (8%)                     36      55         7        51-48
Since Labor Day (8%)                        30      54        13        49-49
In week before election (23%)               38      46        13        49-47


Source: 1976 and 1980 election day surveys by The New York Times/CBS News Poll and
1976 election day survey by NBC News.

New York Times, November 9, 1980, p. A-28.




For sets of highly labeled numbers, a wordy data graphic—
coming close to straight text—works well. This table of numbers
is nicely organized into a graphic:


[Figure: "Some Winners and Losers in the Forecasting Game," a wordy graphic comparing the predictions that eight forecasters made about a year earlier for key economic indicators (unemployment rate, corporate profits growth, change in consumer prices, industrial production growth) with the probable 1978 outcomes, which are shown in a black panel.]


Making Complexity Accessible: Combining Words, 
Numbers, and Pictures 


Explanations that give access to the richness of the data make 
graphics more attractive to the viewer. Words and pictures are 
sometimes jurisdictional enemies, as artists feud with writers for 
scarce space. An unfortunate legacy of these craft-union differences 
is the artificial separation of words and pictures; a few style sheets 
even forbid printing on graphics. What has gone wrong is that the 
techniques of production instead of the information conveyed 
have been given precedence. 

Words and pictures belong together. Viewers need the help that 
words can provide. Words on graphics are data-ink, making 
effective use of the space freed up by erasing redundant and non-
data-ink. It is nearly always helpful to write little messages on the
plotting field to explain the data, to label outliers and interesting 
data points, to write equations and sometimes tables on the graphic 
itself, and to integrate the caption and legend into the design so 
that the eye is not required to dart back and forth between textual 
material and the graphic. (The size of type on and around graphics 


New York Times, January 2, 1979, p. D-3. 




can be quite small, since the phrases and sentences are usually not 
too long—and therefore the small type will not fatigue viewers 
the way it does in lengthy texts.) 

The principle of data/text integration is


Data graphics are paragraphs about data and 
should be treated as such. 


Words, graphics, and tables are different mechanisms with but a 
single purpose—the presentation of information. Why should the 
flow of information be broken up into different places on the page 
because the information is packaged one way or another? Some- 
times it may be useful to have multiple story-lines or multiple 
levels of presentation, but that should be a deliberate design judg- 
ment, not something decided by conventional production require- 
ments. Imagine if graphics were replaced by paragraphs of words 
and those paragraphs scattered over the pages out of sequence with 
the rest of the text—that is how graphical and tabular information 
is now treated in the layout of many published pages, particularly 
in scientific journals and professional books. 
Tables and graphics should be run into the text whenever possible,
avoiding the clumsy and diverting segregation of “See Fig. 2,”
(figures all too often located on the back of the adjacent page).3
If a display is discussed in various parts of the text, it might well
be printed afresh near each reference to it, perhaps in reduced
size in later showings. The principle of text/graphic/table
integration also suggests that the same typeface be used for text and
graphic and, further, that ruled lines separating different types of
information be avoided. Albert Biderman notes that illustrations
were once well-integrated with text in scientific manuscripts, such
as those of Newton and Leonardo da Vinci, but that statistical
graphics became segregated from text and table as printing
technology developed:


The evolution of graphic methods as an element of the scientific
enterprise has been handicapped by their adjunctive, segregated,
and marginal position. The exigencies of typography
that moved graphics to a segregated position in the printed
work have in the past contributed to their intellectual
segregation and marginality as well. There was a corresponding
organizational segregation, with decisions on graphics often
passing out of the hands of the original analyst and communicator
into those of graphic specialists—the commercial artists
and designers of graphic departments and audio-visual aids
shops, for example, whose predilections and skills are usually
more those of cosmeticians and merchandisers than of scientific
analysts and communicators.4


3 “Fig.,” often used to refer to graphics, is an ugly abbreviation
and not worth the two spaces saved.

4 Albert D. Biderman, “The Graph as a Victim of Adverse
Discrimination and Segregation,” Information Design Journal,
1 (1980), 238.




Page after page of Leonardo's manuscripts have a gentle but 
thorough integration of text and figure, a quality rarely seen in 
modern work: 


[Figure: facsimile of a manuscript page from Leonardo da Vinci's Treatise on Painting, with small drawings set directly into the handwritten Italian text.]


Finally, a caveat: the use of words and pictures together requires 
a special sensitivity to the purpose of the design—in particular, 
whether the graphic is primarily for communication and illus- 
tration of a settled finding or, in contrast, for the exploration of 
a data set. Words on and around graphics are highly effective— 
sometimes all too effective—in telling viewers how to allocate 
their attention to the various parts of the data display.5 Thus, for 
graphics in exploratory data analysis, words should tell the viewer 
how to read the design (if it is a technically complex arrangement) 
and not what to read in terms of content. 


Leonardo da Vinci, Treatise on Painting 
[Codex Urbinas Latinus 1270], vol. 2, 
facsimile (Princeton, 1956), p. 234, 
paragraph 827. 


5 Experiments in visual perception indi- 
cate that word instructions substantially 
determine eye movements in viewing 
pictures. See John D. Gould, "Looking 
at Pictures," in Richard A. Monty and 
John W. Senders, eds., Eye Movements 
and Psychological Processes (Hillsdale, N.J., 
1976), 323-343. 




Accessible Complexity: The Friendly Data Graphic 


An occasional data graphic displays such care in design that it is 
particularly accessible and open to the eye, as if the designer had 
the viewer in mind at every turn while constructing the graphic. 
This is the friendly data graphic. 

There are many specific differences between friendly and
unfriendly graphics:


Friendly: words are spelled out, mysterious and elaborate
encoding avoided
Unfriendly: abbreviations abound, requiring the viewer to sort
through text to decode abbreviations

Friendly: words run from left to right, the usual direction for
reading occidental languages
Unfriendly: words run vertically, particularly along the Y-axis;
words run in several different directions

Friendly: little messages help explain data
Unfriendly: graphic is cryptic, requires repeated references to
scattered text

Friendly: elaborately encoded shadings, cross-hatching, and
colors are avoided; instead, labels are placed on the graphic
itself; no legend is required
Unfriendly: obscure codings require going back and forth
between legend and graphic

Friendly: graphic attracts viewer, provokes curiosity
Unfriendly: graphic is repellent, filled with chartjunk

Friendly: colors, if used, are chosen so that the color-deficient
and color-blind (5 to 10 percent of viewers) can make sense of
the graphic (blue can be distinguished from other colors by most
color-deficient people)
Unfriendly: design insensitive to color-deficient viewers; red
and green used for essential contrasts

Friendly: type is clear, precise, modest; lettering may be done
by hand
Unfriendly: type is clotted, overbearing

Friendly: type is upper-and-lower case, with serifs
Unfriendly: type is all capitals, sans serif


With regard to typography, Josef Albers writes: 


The concept that “the simpler the form of a letter the simpler 
its reading” was an obsession of beginning constructivism. It 
became something like a dogma, and is still followed by 
"modernistic" typographers. . .. Ophthalmology has disclosed 
that the more the letters are differentiated from each other, the 
easier is the reading. Without going into comparisons and 
details, it should be realized that words consisting of only 
capital letters present the most difficult reading—because of 
their equal height, equal volume, and, with most, their equal 
width. When comparing serif letters with sans-serif, the latter 
provide an uneasy reading. The fashionable preference for 
sans-serif in text shows neither historical nor practical 
competence.6


6 Josef Albers, Interaction of Color (New
Haven, 1963, revised edition 1975), p. 4.




Proportion and Scale: Line Weight and Lettering 


Graphical elements look better together when their relative pro- 
portions are in balance. An integrated quality, an appropriate 
visual linkage between the various elements, results. This musical 
score of Karlheinz Stockhausen exhibits such a visual balance: 


[Figure: excerpt from the score of Stockhausen's "Zyklus für einen Schlagzeuger."]

In contrast, this next design is heavy handed, with nearly every 
element out of balance: the clotted ink, the poor style of lettering, 
the puffed-up display of a small data set, the coarse texture of the 
entire graphic, and the mismatch between drawing and sur- 
rounding text: 


[Figure: plot of seats against the Democratic share of vote, with axis ticks from 40% to 70% and the annotation "Actual result: Democrats received 50.9% votes, 55.4% seats."]

Figure 4. Seats and Votes in 1968.


Karlheinz Stockhausen, Texte, vol. 2 
(Cologne, 1964), p. 82, from the score 
of "Zyklus für einen Schlagzeuger.” 


Edward R. Tufte, “The Relationship 
Between Seats and Votes in Two-Party 
Systems," American Political Science 
Review, 67 (June 1973), 551. 




Lines in data graphics should be thin. One reason eighteenth- 
and nineteenth-century graphics look so good is that they were 
engraved on copper plates, with a characteristic hair-thin line. 
The drafting pens of twentieth-century mechanical drawing 
thickened linework, making it clumsy and unattractive. 

An effective aesthetic device is the orthogonal intersection of 
lines of different weights: 


Poster for the exhibition “Mondrian and 
Neo-Plasticism in America,” Yale Uni- 
versity Art Gallery, October 18 to 
December 2, 1979. The original painting 
was done in 1941 by Diller; see Nancy 
J. Troy, Mondrian and Neo-Plasticism in 
America (New Haven, 1979), p. 28. 


Nearly every intersection of the lines in this design (based on a 
painting by Burgoyne Diller) involves lines of differing weights, 
and it makes a difference, for the painting’s character is diluted 
with lines of constant width: 




Likewise, data graphics can be enhanced by the perpendicular 
intersections of lines of differing weights. The heavier line should 
be a data measure. In a time-series, for example: 


The contrast in line weight represents contrast in meaning. The
greater meaning is given to the greater line weight; thus the data 
line should receive greater weight than the connecting verticals. 
The logic here is a restatement, in different language, of the 
principle of data-ink maximization. 


Proportion and Scale: The Shape of Graphics 


Graphics should tend toward the horizontal, greater in length 
than height: 


lesser height 


greater length 


Several lines of reasoning favor horizontal over vertical displays.
First, analogy to the horizon. Our eye is naturally practiced in
detecting deviations from the horizon, and graphic design should
take advantage of this fact. Horizontally stretched time-series
are more accessible to the eye:


The analogy to the horizon also suggests that a shaded, high con- 
trast display might occasionally be better than the floating snake. 
The shading should be calm, without moiré effects. 




Second, ease of labeling. It is easier to write and to read words 
that read from left to right on a horizontally stretched plotting- 
field: 


[Figure: "some labels" and "some other labels" each set on a single horizontal line, instead of being broken word by word into narrow stacked columns.]


Third, emphasis on causal influence. Many graphics plot, in essence, 


effect 


Cause 


and a longer horizontal helps to elaborate the workings of the 
causal variable in more detail. 




Fourth, Tukey’s counsel. 


Most diagnostic plots involve either a more or less definite 
dependence that bobbles around a lot, or a point spatter. Such 
plots are rather more often better made wider than tall. Wider- 
than-tall shapes usually make it easier for the eye to follow 
from left to right. 

Perhaps the most general guidance we can offer is that 
smoothly-changing curves can stand being taller than wide, 
but a wiggly curve needs to be wider than tall. . . .7 


And, finally, Playfair’s example. Of the 89 graphics in six dif- 
ferent books by William Playfair, most (92 percent) are wider than 
tall. Several of the exceptions are his skyrocketing government 
debt displays. This plot shows the dimensions of each of those 
89 graphics: 


7 John W. Tukey, Exploratory Data
Analysis (Reading, Mass., 1977), p. 129.


[Figure: scatterplot of the dimensions of Playfair's 89 graphics, with height in inches on the vertical axis and regions labeled "Graphic is taller than wide," "Graphic is square," and "Graphic is wider than tall." Each plotted point represents the upper right-hand corner of one of Playfair’s graphics.]




If graphics should tend toward the horizontal rather than the
vertical, then how much so? A venerable (fifth-century B.C.) but
dubious rule of aesthetic proportion is the Golden Section, a
“divine division” of a line.8 A length is divided such that the smaller
is to the greater part as the greater is to the whole:

$$\frac{a}{b} = \frac{b}{a+b}$$

Solving the quadratic when $a = 1$ yields $b = \frac{\sqrt{5}+1}{2} = 1.618\ldots$

In turn the Golden Rectangle is

[Figure: rectangle with sides 1.0 and 1.618...]

8 The combination of geometry and [illegible] Golden Rectangle
can be seen in Miloutine Borissavliévitch, The Golden Number and the
Scientific Aesthetics of Architecture (New York, 1958) and Tons
Brunés, The Secrets of Ancient Geometry (Copenhagen, 1967),
vols. 1 and 2.
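
As a quick numerical check of the Golden Section, here is a minimal Python sketch. It assumes the proportion is rearranged into the quadratic b² − ab − a² = 0 and takes the positive root; the function name `golden_section` is hypothetical and chosen only for this illustration.

```python
import math


def golden_section(a=1.0):
    """Return b such that a/b == b/(a + b).

    Cross-multiplying gives b**2 - a*b - a**2 = 0, whose positive
    root is a * (1 + sqrt(5)) / 2.
    """
    return a * (1.0 + math.sqrt(5.0)) / 2.0


b = golden_section(1.0)
print(round(b, 3))

# Both sides of the defining proportion should agree.
smaller_to_greater = 1.0 / b
greater_to_whole = b / (1.0 + b)
```

With a = 1 the two ratios coincide, which is the sense in which the smaller part is to the greater as the greater is to the whole.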


The nice geometry of the Golden Rectangle is not unique;
Birkhoff points out that at least five other rectangles (including
the square) have one simple mathematical property or another for
which aesthetic claims might be made:9

[Figure: rectangles with proportions 1, 1.414, 1.618, and 1.732.]

9 George D. Birkhoff, Aesthetic Measure
(Cambridge, 1933), pp. 27-30.


Playfair favored proportions between 1.4 and 1.8 in about
two-thirds of his published graphics, with most of the exceptions
moving more toward the horizontal than the golden prescription:

[Figure: proportions of Playfair's graphics, with the Golden Rectangle marked.]




Visual preferences for rectangular proportions have been studied
by psychologists since 1860, but, even given the implausible
assumption that such studies are relevant to graphic design, the
findings are hardly decisive. A mild preference for proportions near
to the Golden Rectangle is found among those taking part in the
experiments, but the preferred height/length ratios also vary a
great deal, ranging between

[Figure: rectangles showing the range of preferred proportions.]

And, as is nearly always the case in experiments in graphical
perception, viewer responses were found to be highly
context-dependent.10


The conclusions: 


* If the nature of the data suggests the shape of the graphic, 
follow that suggestion. 


* Otherwise, move toward horizontal graphics about 50 percent
wider than tall:
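
The second conclusion reduces to simple arithmetic, sketched here in Python. The function name `horizontal_dimensions` and the `widen` parameter are assumptions made for this illustration; only the 50 percent figure comes from the text.

```python
def horizontal_dimensions(height, widen=0.5):
    """Return (length, height) for a graphic `widen` wider than tall.

    The default widen=0.5 corresponds to "about 50 percent wider than
    tall", a length-to-height ratio of 1.5.
    """
    if height <= 0:
        raise ValueError("height must be positive")
    return (height * (1.0 + widen), height)


length, height = horizontal_dimensions(4.0)
print(f"{length:.1f} x {height:.1f} inches")
```

A ratio of 1.5 falls inside the 1.4 to 1.8 band that Playfair favored and a little short of the Golden Rectangle's 1.618. When the data suggest a shape of their own, the first conclusion takes precedence and no such default applies.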


10 I have relied on Leonard Zusne, Visual
Perception of Form (New York, 1970),
ch. 10, for a summary of the immense
literature.


Epilogue: Designs for the Display of Information 


Design is choice. The theory of the visual display of quantitative 
information consists of principles that generate design options and 
that guide choices among options. The principles should not be 
applied rigidly or in a peevish spirit; they are not logically or mathe- 
matically certain; and it is better to violate any principle than to 
place graceless or inelegant marks on paper. Most principles of 
design should be greeted with some skepticism, for word authority 
can dominate our vision, and we may come to see only through 
the lenses of word authority rather than with our own eyes. 

What is to be sought in designs for the display of information 
is the clear portrayal of complexity. Not the complication of the 
simple; rather the task of the designer is to give visual access to 


the subtle and the difficult—that is, 


the revelation of the complex. 


Index 




aesthetics, graphical 177-191 

air pollution 42, 170 

Akahata ("Red Flag") 83 

Albers, Josef 183 

Alder, Berni 134, 145 

American Education 118 

American Political Science Review 167 
Annuaire Statistique de la France 167 
Anscombe, F. J. 14 

Anscombe’s quartet 13-14 

area and quantity 69-73 

Arkin, Herbert 112 

Asahi 83, 167, 168 

Asch, S. E. 56

astronomical graphics 26—29, 154, 166 
Ayres, Leonard P. 141, 155 

Ayres, Richard E. 95 


Bamburger, Clara Francis 82 
Bancroft, T. A. 53, 140 

bar chart 96-97, 126-128, 129 
Barabba, Vincent P. 153 
Barnett, Vic 114, 168 
before-after time-series 39 
Beniger, James R. 20, 86 
Bertin, Jacques 112, 166, 169, 178 
Bickmore, D. P. 162 
Biderman, Albert D. 181 

‘Big Duck’ 116, 117 

bilateral symmetry 97 
Biochemistry 110 

Biochimica et Biophysica Acta 110 
Birkhoff, George D. 189

blot maps 20, 139 

Blot, William J. 16 

Bonner, John Tyler 94 

boring statistics 79-80, 87 
Borissavliévitch, Miloutine 189 


Boutique Data Graphics 128 
Bowen, William G. 95 

box plot 97, 123, 129 

Brier, Stephen S. 14 

Brinton, Willard C. 112 
Brown, Denise Scott 116—117 
Brunés, Tons 189 

Bryan, Kirk. 99 

Business Week 63, 83, 167 


calculus 9 

calculus, graphical 46 

California Water Atlas 119 

Campbell, Angus 147 

Campbell, Donald T. 75

cancer 16-20, 47, 171 

cancer maps 16-20, 121 

Cappon, Lester J. 146 

cars, frequency of repair 174 

Cartesian coordinates 9 

cause and effect 37, 47, 82, 187 

СЕМ 56 

chartjunk 107-121 

Chavannes, E. 21

chemical elements 102-105 

Chernoff faces 97, 142 

Chernoff, Herman 136, 142 

Cheysson, É. 154

chicken, 4,340-pound 73 

cholera map 24 

chromosomes 172-173 

Cleveland, William S. 38, 168 

color 153-154, 183

color-deficient viewers 183 

Colton, Raymond R. 112 

computer graphics 26-27, 29, 42, 112, 
116, 120-121, 136, 153, 170 

Connecticut speeding 74-75 


constant dollars 65-68 
Consumer Reports 80, 174 
context 74-75 

Converse, Philip E. 147 
copper, conductivity of 49 
Cornford, Eain M. 109 
Cosmographia 22 

county maps 16-20, 153 
Cox, Michael D. 99 
Crotty, William J. 95
Curti, Merle 85 


Dahl, Robert A. 85

Dakin, Edwin F. 15

Daniel, Cuthbert 133

data-based grid 148 

data-based labels 149—152 

data-built data measures 139-144

data density 161-169 

data-ink 91-105 

data-ink maximization 96-105, 
123-137, 175, 186, 187 

data-ink ratio 93-96 

data maps 16-27 

data matrix 167-169 

data measures 139-144 

data/text integration 180-182 

data variation 60-61 

Davis, John C. 162 

Day Mines, Inc. 54 

decoder, visual 153-154 

deflating money 65-68 

Der Spiegel 83 

design variation 60-63 

Dewey, Edward R. 15 

Die Zeit 83 

Diller, Burgoyne 185 

distortion, graphical 55-59 

dogs 50 


Doll, R. 47, 82 

dollars, constant 65-68 
dot-dash-plot 133, 136, 139, 152 
double-functioning labels 149-152 
Duchamp, Marcel 36 

ducks 116-121 


‘Easter Wings’ 143 

economic data 15, 32-34, 38, 54-55, 
61-68, 70, 91-92, 108, 126, 148, 152, 
158, 161, 180 

The Economist 63, 83, 167 

educational tests, graphics 86 

Ehrenberg, A. S. C. 178 

election graphics 147, 179 

electroencephalogram 93 

elegance, graphical 177 

Eliot, T. S. 100 

encodings 178, 183 

erasing principles 96-100, 136 

Erbring, Lutz 120 


excellence, graphical 13-51 


Fienberg, Stephen E. 14 
‘Fig.’ 181 

Finkner, Alva L. 153 
Fiorina, Morris 66 

Fisher, R. A. 133 

Flanigan, William H. 85 
Flury, Bernhard 97 

Fortune 167 

Francolini, C. M. 153 
Frankfurter Allgemeine 83 
Fraumeni, Joseph F. 16 
French communes 166 
French wine map 25-26 
friendly data graphics 183 
fuel economy graphic 57-59 
Funkhouser, H. Gray 20, 28, 154 


Gabaglio, Antonio 72 

galaxies map 26-27, 154, 166 
gecko 36 

Geological Survey maps 168 
Gilbert, E. W. 24 

Goldenberg, Edie N. 120 
Goodin, W. R. 170 

Gould, John D. 182 
government spending 64-68, 158 
grade school graphics 86 

Graph of Magical Parallelepipeds 68 
Graphical Hack 59 

gray grid 116 


gray shades 154 
grids 112-116 
grids, data-based 145-148 
grids, gray 116 
grids, white 127-129, 136 
Groth, Edward J. 26 
Gurnett, Donald A. 29 


half-face 97, 136 

Halley, Edmond 23 

Hankins, Timothy H. 134 
Hannibal's campaign 176-177 
Hayward, Roger 102, 136 
Herbert, George 143 

Herdeg, Walter 80, 152 

herring catches 172 

hierarchy in graphics 153-159 
high-information graphics 168-169 
Hilgard, Ernest 85 

histogram 126-128 

histogram, living 140 

Hjort, Johan 172 
Ho, C. Y. 49, 150 
Hoover, Robert 16 
horizon, analogy to 186-187 
horse paces 34-35 
House of Representatives 37 
Huot, Marie E. 109 


identification numbers 149-150 
Israel, Atlas of 178 

Italian post office 72 

Izenour, Steven 117 


Japan, graphics in university entrance 
exams 86 

Japanese beetle 43, 121 

Japanese graphics 82-84 

JASA style sheet 110, 150-151, 164 

Joiner, Brian L. 140 

Journal of the American Chemical 
Society 110 

Journal of American Medical Association 
167 

Journal of the American Statistical 
Association 110, 150-151, 164, 
167-168 

Journal of Biological Chemistry 110 

Journal of Chemical Physics 110 

Journal of the Royal Statistical Society 167 

Jupiter graphic 29 


Kahrl, William L. 119 
Kelley, Stanley 95 


Kolers, Paul A. 79 

Kooi, Kenneth 93 

Kouchoukos, Nicholas 109 

Kuznicki, James T. 100, 109 

labels on graphics 180-182, 183, 187 

Lambert, Johann Heinrich 29, 32, 
45-46 

Lancet 110 

lane stripes 144 

Larsen, Wayne A. 129 

Lauer, A. R. 144 

Law School Admission Test, 
graphics 86 

Learning from Las Vegas 116-117 

Leonardo da Vinci 181-182 

less is a bore 175 

less is more 175 

lettering 184 

Lie Factor 57 

lies, defense of 76-77 

Liley, P. E. 49, 150 

line weight 185 

living histogram 140 

Long, John Hamilton 146 

Los Angeles smog 42, 170 

Los Angeles Times 42, 69 

lung cancer 16-20, 47 

lying graphics 76-77 


Macdonald-Ross, Michael 56 

MacGregor, A. J. 112 

magnetic monopole 39 

mail, House of Representatives 37 

Malcolm, Andrew H. 84 

Manvel, Allen D. 81 

map, blot 20, 139 

map, cancer 16-20 

map, cholera 24 

map, data 16-27 

map, French wine 25-26 

map, galaxies 26-27, 154, 166 

map, Israel 178 

map, patch 20 

map, thematic 16-27 

map, Yü Chi Thu 20-21 

Marey, E. J. 31, 34-36, 40, 115-116, 121 

Mason, Thomas J. 16 

Masterton, William 85 

Matisse, Henri 160 

maximizing data-density 168 

maximizing data-ink 96-105, 123-137, 
175, 186, 187 

McClenaghan, William 85 


McCracken, Paul 48 

McCullagh, Michael J. 162 

McCutcheon, N. Bruce 100, 109 

McGill, Robert 129 

McKay, Frank W. 16 

McRae, Gregory J. 42, 170 

Meihoefer, H. J. 56 

melanoma 171 

Mies van der Rohe, Ludwig 175 

Miller, Arthur H. 120 

Miller, Warren E. 147 

Minard, Charles Joseph 24-25, 39, 
40-41, 51, 121, 176-177 

Mitchell, H. L. 50 

moiré vibration 107-112 

Le Monde 83, 167 

Mondrian, Piet 185 

Monkhouse, F. J. 112, 165 

Monty, Richard A. 168, 182 

Moore, Carol 152 

Moscow, Napoleon’s campaign 40-41, 
121, 176-177 

mountain-to-the-sea 154 

multifunctioning elements 139-159 

multiple viewing angles 154-155 

multiple viewing depths 154-155 

multiwindow plots 114, 171 

Murphy, Edmond A. 172 

Museum of Modern Art 91 


Napoleon’s march 40-41, 51, 121, 
176-177 

narrative graphics 40-43 

National Science Foundation 60 

Nature 110, 167 

Needham, Joseph 20 

New England Journal of Medicine 110, 
167 

Newman, Barnett 177 

Newman, L. Hugh 43 

Newsweek 167 

Newton, Isaac 181 

New York City weather 30, 121, 164 

New York Times 30, 54, 57, 61, 66, 74, 
76, 80-83, 86, 121, 164, 167, 179, 180 

Nihon Keizai 83 

non-data-ink 96, 107 

Nude Descending a Staircase 36 


ocean currents 99 

oil prices 61-63 

optical art 107-108 
optical dots 114 
orthogonal lines 185-186 
outliers 142 


parallel schematic plot 125 

patch maps 20 

Pauling, Linus 85, 102 

Pearson, Karl 145 

Peebles, P. James E. 26 

periodicity of elements 102-105 

perpendicular lines 185-186 

Petchenik, Barbara Bartz 146 

Phillips curve 48 

pie chart 178 

Pittsburgh Civic Commission 55 

Playfair, William 9, 32-34, 43-45, 52, 
64-65, 73, 91-92, 126, 148, 188-189 

population density map 155-157 

Potter, D. E. 145 

Powell, R. W. 49, 150 

Prakash, Om 172 

Pravda 83, 167 

Pravda School of Ordinal Graphics 76 

presidential vote graphics 147, 179 

press, graphical sophistication 82-84 

Proceedings of the National Academy of 
Sciences, U.S.A. 110 

Psychological Bulletin 167 

psychological experiments 55-56 

Pugin 117 

puzzle graphics 153 


quartile plot 124, 136, 139 
range-frame 130, 136, 139, 149, 152 
redundant data-ink 96-100 
Reinhardt, Ad 122 

Reis, Albert J. 14 
relational graphics 43-50 
Rhône bridge 39 

Rickett, Barney J. 134 
Riedwyl, Hans 97 

Riley, Bridget 108 

road stripes 144 

Roberts, K. V. 145 
Robinson, Arthur H. 20, 40 
Robyn, Dorothy L. 20 
Rogers, Anna C. 112 

Ross, H. Laurence 75 
royalty, genealogy 34-35 
rugplot 135, 136 

Ruskin, John 117 


sampling error 172 
Sampson, Roy J. 85 
Samuelson, Paul 85 
sans serif 183 
SAS/GRAPH 112 


Satet, R. 69, 112 

Sato, Isao 82, 85 

Scarf, F. L. 29 

scatterplot 130-135 

Schlee, Susan 172 

Schmid, Calvin F. 112 
Schmid, Stanton E. 112 
Science 39, 83, 109, 110, 167, 173 
Science Indicators, 1974 60 
Scientific American 50, 73, 167 
sea-horse 36 
Seinfeld, J. H. 170 

Seldner, Michael 26 
semi-graphics 178-180 


Senders, John W. 168, 182 
Shahn, Ben 177 
shape, graphical 186-190 


Shinohara, Miyohei 82, 85 

Shiskin, Julius 38 

Shrink Principle 169 

Shryock, Henry S. 113 

Siebers, B. H. 26 

Siegel, Jacob S. 113 

Silverstein, Louis 80 

Slowinski, Emil 85 

small multiple 42, 48, 170-175 

smog 42, 170 

Snow, Dr. John 24 

Social Indicators, 1973 163 

sophistication, graphical 82-86 

Spear, Mary Eleanor 81, 112, 123 

starfish 36 

Statistical Abstract of the United States 
161, 167 

stem-and-leaf plot 140 

Stockhausen, Karlheinz 184 

stock prices 15 

Stokes, Donald E. 147 

strangers 142 

Strunk, William 81 

Sunday Times (London) 63 

sunshine, British 165 

supertable 179 

Swift, Jonathan 106 

Sypher, Wylie 139 


table-graphic 145, 158-159 
tables 56, 178-180 

Tanur, Judith 85 

television graphics 76, 81 
Tell-A-Graf 112 

Terpenning, Irma J. 38 
textbooks 84-85 

textbooks, statistical graphics 112 


text-tables 178-180 
Thissen, David 142 
Thomas, Edward Llewellyn 168 
Thomas, LaVerne 85 
Thompson, Morris M. 168 
Thrower, Norman J. W. 23 
Thurber, James 52 
tick marks 127-129, 149 
Tiling, Laura 32, 45-46 
Time 62, 71, 79, 83 
The Times (London) 83, 167 
time-series 28-39 
time-series, before-after 39 
Todd, Lewis Paul 85 
train schedule 31, 98, 115-116, 121 
Troy, Nancy J. 185 
Tufte, Edward R. 29, 53, 75, 95, 184 
Tukey, John W. 53, 114, 123, 125, 
129, 136, 140, 168, 188 
Tukey, Paul A. 114, 168 
tungsten, conductivity of 150 
typography 178-183 


Valéry, Paul 153 
Vasarely, Victor 
Vauthier, L. L. 
Venturi, Robert 117, 139, 175 
viewing architecture 159 


Wainer, Howard 79, 142, 153 
Waldrop, M. Mitchell 39 
Wall Street Journal 83, 167-168 
Washington Post 62, 70, 83 
White, E. B. 81 
white grid 127-129, 136 
White, Jan 79-80 
white pines 50 
Wilk, Martin B. 53 
Wilkinson, H. R. 112, 165 
Wilson, James Q. 85 
Wiskenmann, Arthur 171 
World War I graphic 141 
Wurman, Richard Saul 90 


Yajima, Yokichi 85 
Youden, W. J. 143 
Yü Chi Thu map 20-21 
Yunis, Jorge J. 172 


Zeeman, E. C. 50 
Zingale, Nancy H. 85 
Zusne, Leonard 97, 190 


The following illustrations are reprinted by permission: 

CHAPTER 1. Anscombe’s quartet © 1973 American Statistical 
Association. Map of the galaxies, P. James E. Peebles, Princeton 
University. Radio emissions of Jupiter, Donald A. Gurnett, © 1979 
American Association for the Advancement of Science. New York 
City weather history © 1981 The New York Times Company. 
Text on franked mail © 1975 The New York Times Company. 
Magnetic monopole © 1982 American Association for the 
Advancement of Science. Bridge on Rhône, Bibliothèque de l'École 
Nationale des Ponts et Chaussées. Los Angeles smog, by Bob Allen 
© 1979 The Los Angeles Times, based on data from Gregory J. 
McRae, California Institute of Technology. Japanese beetle © 1965 
Aldus Books, London. Phillips curve, Organisation for Economic 
Cooperation and Development, Paris. Dogs © 1976 Scientific 
American, Inc. 

CHAPTER 2. Payments to travel agents © 1978 The New York Times 
Company. Drawing by CEM © 1961 The New Yorker Magazine, Inc. 
Fuel economy standards © 1978 The New York Times Company. 
OPEC oil prices © 1978 The New York Times Company. Oil prices 
© 1979 Time, Inc. Oil prices © 1979 The Washington Post. Real 
price of oil, Business Week, © 1979 McGraw-Hill, Inc. Real price 
of oil © 1979 The Sunday Times, London. Real price of oil © 1979 
The Economist. Growth of government © 1977 Yale University 
Press. New York budget © 1976 The New York Times Company. 
Shrinking doctor, by Bob Allen and Pete Bentevoja, © 1979 The 
Los Angeles Times. Shrinking dollar © 1978 The Washington Post. 
Picture of oil derricks © 1981 The New York Times Company. 

CHAPTER 3. Views on economy © 1980 The New York Times 
Company. Pace of city life © 1976 The New York Times Company. 

CHAPTER 4. Electroencephalogram © 1971 Harper & Row. 
Generation time © 1965 Princeton University Press. Registration 
rates © 1967 American Political Science Association. Registration 
rates © 1970 Addison-Wesley Publishing Company. Registration 
rates © 1970 Holt, Rinehart and Winston. Ocean currents, Kirk 
Bryan, Princeton University and the American Meteorological 
Society. Bar chart, taste experiment, James T. Kuznicki, Procter & 
Gamble Company. Roger Hayward drawing of periodic properties 
of elements, Linus Pauling. 

CHAPTER 5. Vertical bars, glucose transfer, Eain M. Cornford, 
© 1981 American Association for the Advancement of Science. 
Multiwindow plot, Paul A. Tukey. Big Duck, Peter Blake. 
California crops, State of California, Office of Planning and 
Research. Newspaper content © 1979 American Political Science 
Association. 

CHAPTER 6. Parallel schematic plot © 1977 Addison-Wesley 
Publishing Company. Box plot variation © 1978 American 
Statistical Association. Pulsar signals © 1975 Academic Press. 
Empirical cumulative distribution © 1976 John Wiley. 

CHAPTER 7. Stem-and-leaf plot © 1972 The Iowa State University 
Press. Living histogram, Brian Joiner. Magnetohydrodynamics 
© 1970 Academic Press. 1783 map © 1976 Princeton University 
Press. 1956 and 1960 elections © 1966 John Wiley. 

CHAPTER 8. French communes © 1973 Mouton and 
Gauthier-Villars. Bertin small multiple © 1973 Mouton and 
Gauthier-Villars. Melanoma © 1972 Springer-Verlag. Normal 
distributions © 1964 Pergamon Press, Ltd. Chromosomes, Jorge J. 
Yunis, © 1982 American Association for the Advancement of 
Science. Automobile frequency of repair © 1982 Consumers Union 
of the United States, Inc. 

CHAPTER 9. How different groups voted for president © 1980 The 
New York Times Company. Forecasting © 1979 The New York 
Times Company. Leonardo da Vinci drawing, facsimile © 1956 
Princeton University Press. Musical score, Karlheinz Stockhausen, 
© 1964 Verlag M. DuMont Schauberg. Seats-votes graph © 1973 
American Political Science Association. Poster illustration based 
on Burgoyne Diller, Geometrical Composition in Black, Red, and 
White, Yale University Art Gallery, Collection Société Anonyme. 


Acknowledgments, Second Edition 


In preparing this new edition, I am grateful to Michael Arsenault, 
Michael and Winifred Bixler, Nicholas Cox, Inge Druckrey, 
Howard Gralla, Graham Larkin, MaryBeth Uryga, Carolyn Williams, 
and to GHP. 

The wonderful staff of Graphics Press continues with their very 
special care: Kate Atkinson, Karen Bass, Cynthia Bill, Donna Karosi, 
Elaine Morse, Kathy Orlando, Peter Taylor, and Carolyn Williams. 


January 2001 
Cheshire, Connecticut 


Acknowledgments 


I am indebted to many for their advice and assistance with this book. 

For leave and research support during several academic years, 
the Center for Advanced Study in the Behavioral Sciences, the 
John Simon Guggenheim Foundation, the Woodrow Wilson School 
of Princeton University, and Yale University. 

For providing access to their superb collections, the Bibliothèque 
Nationale and the Bibliothèque de l'École Nationale des Ponts et 
Chaussées in Paris, and, at Yale University, the Historical Medical 
Library and the Beinecke Rare Book and Manuscript Library. 

For helping me appreciate the practicalities in the production 
of statistical graphics, several members of the art department at the 
New York Times and my students in Graphic Design and in the 
Department of Statistics at Yale. 

For assistance in establishing Graphics Press, Peter B. Cooper, 
Earle E. Jacobs, Jr., and Trudy Putsche. 

For design and artwork, Howard I. Gralla and Minoru Niijima. 

For their help and hospitality in Paris during my work on the 
Minard drawings, Michel Balinski, Jean Dubout, André Jammes, 
and Claudine Kleb. 

For providing examples and for suggesting improvements in the 
manuscript, James Beniger, Inge Druckrey, Timothy Gregoire, 
Joanna Hitchcock, Joseph LaPalombara, Kathryn Scholle, Stephen 
Stigler, Howard Wainer, and Ellen Woodbury. 

For their reviews of the manuscript and for their inspiration and 
encouragement through all the years of this enterprise, Frederick 
Mosteller and John W. Tukey. 


June 1982 
Cheshire, Connecticut 


